


ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 


THE CLOSURE OPERATORS OF A LATTICE 


By MORGAN WARD 


(Received January 29, 1940) 


I. INTRODUCTION 


1. If $S$ is a lattice of elements $A, B, \cdots$, the class of all operators of $S$ (that 
is, one-valued functions $\phi X = \phi(X)$ on $S$ to $S$) may be made into a lattice by 
defining the union $\theta$ and cross-cut $\kappa$ of any set $\Phi$ of operators $\phi$ by¹ 


$$\theta X = (\cdots \phi X \cdots), \qquad \kappa X = [\cdots \phi X \cdots], \qquad \phi \in \Phi.$$


The union and cross-cut here are taken over all the values $\phi X$ of the operators 
in $\Phi$ for any given $X$ of $S$. 

It is easily verified that the operators of $S$ form a lattice in which $\phi \supset \psi$ if 
and only if $\phi X \supset \psi X$ for every $X$ of $S$; furthermore this lattice is closed, modular, 
or distributive according as $S$ is closed, modular or distributive.² 
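
The pointwise definition of union and cross-cut can be tried out on a small example. The sketch below is only an illustration under stated assumptions: the lattice is taken to be the subsets of a three-element set (union is set union, cross-cut is intersection, and $\supset$ is read as set containment), and the names `union_of`, `cross_cut_of`, `contains`, `add_zero` and `add_one` are hypothetical helpers, not notation from the paper.

```python
from itertools import combinations

BASE = (0, 1, 2)
# Assumed example lattice S: all subsets of BASE.
S = [frozenset(c) for r in range(len(BASE) + 1) for c in combinations(BASE, r)]

def union_of(ops):
    """Pointwise union: X -> union of phi(X) over all operators phi."""
    return {X: frozenset().union(*(phi[X] for phi in ops)) for X in S}

def cross_cut_of(ops):
    """Pointwise cross-cut: X -> intersection of phi(X) over all operators phi."""
    return {X: frozenset(BASE).intersection(*(phi[X] for phi in ops)) for X in S}

def contains(phi, psi):
    """phi contains psi exactly when phi(X) contains psi(X) for every X of S."""
    return all(phi[X] >= psi[X] for X in S)

# Operators are stored as dictionaries X -> phi(X).
identity = {X: X for X in S}
add_zero = {X: X | {0} for X in S}
add_one = {X: X | {1} for X in S}

theta = union_of([add_zero, add_one])      # X -> X with 0 and 1 adjoined
kappa = cross_cut_of([add_zero, add_one])  # X -> X, the identity operator
```

Here the union of the two operators contains each of them, while their cross-cut collapses to the identity, as the ordering of the operator lattice requires.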

The operator lattice of a lattice is a concept comparable in generality to the 
Boolean algebra of all subsets of a lattice. As in the algebra, it is certain 
distinguished sets of operators which are useful in investigating the given lat- 
tice rather than the operator lattice itself. 

One obviously important distinguished type is the linear operator. An 
operator $\phi$ is said to be linear if for any subset $\mathfrak{A}$ of elements $A$ of $S$, it has one 
or more of the four properties 


(i) $\phi(\cdots A \cdots) = (\cdots \phi A \cdots)$, (iii) $\phi[\cdots A \cdots] = [\cdots \phi A \cdots]$, 
(ii) $\phi(\cdots A \cdots) = [\cdots \phi A \cdots]$, (iv) $\phi[\cdots A \cdots] = (\cdots \phi A \cdots)$. 


Here the unions and cross-cuts are taken over all the elements of $\mathfrak{A}$, and $\mathfrak{A}$ is 
finite if $S$ is not closed. Lattice homomorphisms and homomorphisms with 
respect to union with properties (i), (iii) and (i) respectively are familiar 
examples (Ore 1). 

The linear operators and certain associated lattices are important in the 
study of residuated lattices (Ward-Dilworth 1) as I plan to show in detail 
elsewhere.³ 


¹ If $S$ is not closed, $\Phi$ is assumed to contain only a finite number of operators. A lattice 
is said to be closed (or "complete" or "continuous") if it contains the union and cross-cut 
of any subset of elements in it. 

² Chain conditions in $S$ do not usually carry over to the operator-lattice. 

³ The product $\phi\psi$ of two operators $\phi$ and $\psi$ defined by $\phi\psi X = \phi(\psi(X))$ immediately gives 
us an associative multiplication over the operator lattice. On the other hand if $B$ is any 
fixed element of a residuated lattice $S$, the operators $\mu$ and $\rho$ defined by $\mu X = BX$, $\rho X = B:X$ 
have the linear properties $\mu(\cdots A \cdots) = (\cdots \mu A \cdots)$, $\rho(\cdots A \cdots) = [\cdots \rho A \cdots]$. 




I have discussed elsewhere (Ward 2) a type of operator associated with a 
point lattice,⁴ which may be used to classify all such lattices of finite order. 


2. I develop here the properties of a type of operator which is of fundamental 
importance in the study of certain imbedding problems of ring theory and 
semi-group theory.⁵ A typical problem of this class is to imbed a system $I$ of 
elements over which a commutative and associative multiplication is defined in 
a residuated lattice $S$ so as to preserve the multiplication in $I$ and thus to 
study the arithmetical properties of $I$ (Clifford 2, Ward-Dilworth 2). The 
imbedding is effected by defining a suitable type of "ideal" (distinguished 
subset) of $I$ in the Boolean algebra of its subsets.⁶ A closely related problem is to 
imbed a semi-ordered set in a closed lattice (Mac Neille 1). 

The "closure operators" introduced here enable us to view all these problems 
from a unified standpoint, and explain why in all extant theories of ideals as 
distinguished subsets, the cross-cut of two ideals is the set-theoretic cross-cut 
of their elements. 


II. CLOSURE OPERATORS 


3. Let $S$ be a closed lattice. An operator $\phi$ of $S$ is said to be a closure operator 
if it satisfies the following three conditions:⁷ 

I 1. $A \supset B$ implies that $\phi A \supset \phi B$. 

I 2. $\phi \supset \iota$. 

I 3. $\phi^2 = \phi$. 

Here $\iota$ is the identity operator leaving every element of $S$ unchanged. 

If $\mathfrak{T}$ is any set of elements $T$ of $S$, it may be proved that every closure operator 
$\phi$ has the quasi-linear properties 


(3.1) $\phi[\cdots \phi T \cdots] = [\cdots \phi T \cdots]$, 
(3.2) $\phi(\cdots T \cdots) = \phi(\cdots \phi T \cdots)$, $T \in \mathfrak{T}$. 


No actual linearity is assumed. 
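
The three axioms and the quasi-linear properties can be checked by brute force on a small example. The following is a minimal sketch, assuming the lattice of all subsets of the cyclic group $Z_6$ and taking as operator the subgroup generated by a subset; this example and the names `phi`, `i1`, `q31` and so on are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

N = 6
# Assumed example lattice S: all subsets of Z_6 = {0, ..., 5}.
S = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]

def phi(X):
    """Subgroup of Z_6 generated by X (it always contains 0)."""
    closed = set(X) | {0}
    while True:
        new = {(a + b) % N for a in closed for b in closed} - closed
        if not new:
            return frozenset(closed)
        closed |= new

# Axioms I 1, I 2, I 3, with "contains" read as set containment.
i1 = all(phi(A) >= phi(B) for A in S for B in S if A >= B)
i2 = all(phi(A) >= A for A in S)
i3 = all(phi(phi(A)) == phi(A) for A in S)

# Quasi-linear properties (3.1) and (3.2) for pairs of elements.
q31 = all(phi(phi(A) & phi(B)) == phi(A) & phi(B) for A in S for B in S)
q32 = all(phi(A | B) == phi(phi(A) | phi(B)) for A in S for B in S)

# Actual linearity with respect to union fails for this operator.
linear_for_union = all(phi(A | B) == phi(A) | phi(B) for A in S for B in S)
```

The failure of `linear_for_union` (for instance on the subsets {2} and {3}) shows that an operator may satisfy I 1 to I 3 without being linear.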


THEOREM 3.1. The cross-cut⁸ of any set of closure operators is again a closure 
operator. 





⁴ A lattice is called a point lattice if every element in it save the null element is a union 
of points. Here a point is any element covering the null element. Point lattices include 
important types of projective geometries, exchange lattices, and Boolean algebras. 

⁵ For a discussion of these problems, the reader is referred to Clifford 1, 2 where references 
are given to the work of Prüfer and others. 

⁶ Several definitions are usually possible. See Ward-Dilworth 2. 

⁷ These axioms are satisfied by Kuratowski's closure operator over a Boolean algebra 
with points (Kuratowski 1). But they are essentially weaker, as Kuratowski's operator 
is linear with respect to union. Compare also Birkhoff 1. 

⁸ In general, no closure properties hold for the union and product of (closure) operators. 
It may be shown that if $\phi$ and $\psi$ are operators, then $(\phi, \psi)$ is an operator if and only if $(\phi, \psi) = 
\phi\psi = \psi\phi$. Commutativity is thus a necessary condition for the union $(\phi, \psi)$ to be an operator. 
It is evidently a sufficient condition for the product $\phi\psi$ to be an operator. 










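PROOF. Let $\Phi$ be a set of closure operators $\phi$, and let $\kappa = [\cdots \phi \cdots]$ be their 
cross-cut. We shall show that $\kappa$ satisfies I 1, I 2, I 3. 

I 1 is satisfied. For $A \supset B$ implies $\phi A \supset \phi B$ for every $\phi \in \Phi$. Hence 
$[\cdots \phi A \cdots] \supset [\cdots \phi B \cdots]$, $\kappa A \supset \kappa B$. I 2 is satisfied. For since $\phi A \supset A$ 
for every $\phi \in \Phi$, $[\cdots \phi A \cdots] \supset A$ or $\kappa \supset \iota$. I 3 is satisfied. For $\kappa A = 
[\cdots \phi A \cdots]$, $\phi \in \Phi$. Now $\phi \supset \kappa$. Hence $\phi A \supset \kappa A$, $\phi A \supset \phi\kappa A$, $\phi A \supset \kappa^2 A$. 
Accordingly $\kappa \supset \kappa^2$. By I 1 and I 2, $\kappa^2 \supset \kappa$. Hence $\kappa^2 = \kappa$, completing the proof. 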
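
Theorem 3.1, and the remark in the footnote that the union of closure operators need not be a closure operator, can be illustrated numerically. The sketch below assumes two example closure operators on the subsets of {0, ..., 5}: the subgroup of $Z_6$ generated by a subset, and the integer interval spanned by a subset. Both operators and all function names are hypothetical choices made for the illustration.

```python
from itertools import combinations

N = 6
S = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]

def subgroup(X):
    """Subgroup of Z_6 generated by X."""
    closed = set(X) | {0}
    while True:
        new = {(a + b) % N for a in closed for b in closed} - closed
        if not new:
            return frozenset(closed)
        closed |= new

def interval(X):
    """Smallest integer interval containing X (empty for the empty set)."""
    return frozenset(range(min(X), max(X) + 1)) if X else frozenset()

def is_closure(op):
    """Brute-force check of axioms I 1, I 2, I 3 over all of S."""
    return (all(op(A) >= op(B) for A in S for B in S if A >= B)
            and all(op(A) >= A for A in S)
            and all(op(op(A)) == op(A) for A in S))

def kappa(X):
    """Cross-cut of the two operators, taken pointwise."""
    return subgroup(X) & interval(X)

def theta(X):
    """Union of the two operators, taken pointwise."""
    return subgroup(X) | interval(X)
```

In this example `kappa` passes the check while `theta` fails idempotence: `theta` sends {2} to {0, 2, 4}, and a second application enlarges that set to {0, 1, 2, 3, 4}.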


4. Let $\phi$ be a given closure operator, and let $S' = \phi S$ be the set of all its 
values $X' = \phi X$ in $S$. By formula (3.1) any subset $\mathfrak{T}'$ of the $X'$ is closed under 
cross-cut. We may express this fact by writing 


(4.1) $[\cdots T' \cdots]_{S'} = [\cdots T' \cdots]_{S}$, $T' \in \mathfrak{T}'$, $\mathfrak{T}' \subset S'$. 


If $I$ is the unit element of $S$, then $I' = I$ divides all elements $A'$ of $S'$. Hence 
for any subset $\mathfrak{L}'$ of elements $L'$ of $S'$, the class $\mathfrak{K}'$ of all $K'$ such that $K' \supset L'$ 
for every $L'$ of $\mathfrak{L}'$ is non-empty. We define the union of the $L'$ to be the cross-cut of the $K'$: 


(4.2) $(\cdots L' \cdots)_{S'} = [\cdots K' \cdots]_{S}$, $K' \supset$ every $L'$ of $\mathfrak{L}'$. 


We obtain by a familiar argument: 

THEOREM 4.1. The set S’ of values of a given closure operator forms a closed 
lattice within S with respect to the operations of union and cross-cut defined by 
(4.2) and (4.1). 

To each closure operator $\phi$ we may accordingly assign a lattice $S' = \phi S$. In 
particular, $S = \iota S$. We shall establish a converse result. 

Let $S'$ now denote a fixed subset of $S$ closed under cross-cut and containing 
the unit element $I$. We make $S'$ into a lattice within $S$ by assigning to any 
subset $\mathfrak{L}'$ of elements of $S'$ as in (4.2) a union defined as the cross-cut of the 
set of all multiples of the elements of $\mathfrak{L}'$. 

We next define an operator $\phi$ on $S$ to $S'$ as follows: If $A$ is any element of 
$S$, then $\phi A$ is the cross-cut of all elements $B'$ of $S'$ such that $B' \supset A$. Then 
$\phi$ is a closure operator, for I 1, I 2, I 3 are evidently satisfied. Furthermore, 
$\phi S = S'$. 

We have thus established a one-to-one correspondence between the closure 
operators of S and subsets of S closed under cross-cut and containing I. The 
lattice S’ = ¢S and the operator ¢ will be said to belong to one another. 
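
The passage from a subset closed under cross-cut to its closure operator, and back, can be sketched as follows. The example assumes the lattice of all subsets of {0, 1, 2}, with the whole set as unit element; the particular family `S_PRIME` and the names used are hypothetical.

```python
from itertools import combinations

UNIT = frozenset({0, 1, 2})
# Assumed example lattice S: all subsets of UNIT.
S = [frozenset(c) for r in range(4) for c in combinations(sorted(UNIT), r)]

# A subset of S closed under intersection and containing the unit element.
S_PRIME = {UNIT, frozenset({0, 1}), frozenset({1, 2}), frozenset({1})}

def phi(A):
    """Cross-cut of all elements B' of S_PRIME that contain A."""
    result = UNIT
    for B in S_PRIME:
        if B >= A:
            result &= B
    return result

# The set of values of phi recovers the family it was built from.
values = {phi(A) for A in S}
```

Starting from `S_PRIME`, the operator `phi` has exactly `S_PRIME` as its set of values, which is the one-to-one correspondence described above in miniature.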

It is also easily proved from formula (3.2) that $S'$ is a sublattice of $S$ if and 
only if the closure operator belonging to $S'$ is linear with respect to union. 

THEOREM 4.2. The closure operators of any closed lattice themselves form a 
lattice within the operator lattice of $S$. 

PROOF. Let $\Sigma$ denote the set of all closure operators of $S$. By Theorem 3.1, 
the cross-cut of any set of such operators is again a closure operator. 
Furthermore the operator $\omega$ defined by $\omega A = I$, every $A$ of $S$, is obviously a 
closure operator dividing every other closure operator. Hence we may define 
the union of any set $\Phi$ of such operators as the cross-cut of the non-empty set 
of closure operators containing every operator of $\Phi$. 







We may evidently define lattice operations on the set of all subsets $S'$ of $S$ 
closed under cross-cut and containing $I$ by the rules 


(4.3) $[\cdots S' \cdots] = [\cdots \phi \cdots]S$, $\quad (\cdots S' \cdots) = (\cdots \phi \cdots)S$, $\quad \phi \in \Phi$, $S' = \phi S$. 


The lattices $S'$ thus form a lattice simply isomorphic with the lattice $\Sigma$ of 
closure operators. We shall return to these operations at the close of the next 
section. 


5. Consider an operator $\phi$ belonging to a set consisting of two elements $I$ 
and $T$ of $S$. It follows from the previous theorems that $\phi$ is characterized by 


(5.1) $\phi A = I$ if $T \not\supset A$, $\quad \phi A = T$ if $T \supset A$, $\quad A$ any element of $S$. 


We call $\phi$ the two-valued operator belonging to $T$. Since $\phi S$ is a sub-lattice 
of $S$, $\phi$ is linear with respect to union, as is directly evident from (5.1). It is 
easy to prove 

THEOREM 5.1. The ideal operator $\phi$ belonging to any set $S'$ of elements of $S$ 
which is closed under cross-cut and contains $I$ is the operator cross-cut of all the 
two-valued operators belonging to elements of $S'$. 

If $\mathfrak{T}$ is any set of elements $T$ of $S$ containing $I$, we obtain a lattice $S'$ within 
$S$ containing $\mathfrak{T}$ by adjoining to $\mathfrak{T}$ the cross-cuts of all sets of its elements. $S'$ 
is evidently the smallest such lattice containing $\mathfrak{T}$. We call $S'$ the imbedding 
lattice of $\mathfrak{T}$, and its corresponding closure operator the "imbedding operator" 
of $\mathfrak{T}$. We shall use the letter $\theta$ to denote an imbedding operator. 

THEOREM 5.2. If $\theta$ is the imbedding operator of a set $\mathfrak{T}$ of elements of $S$ 
containing $I$, then the value of $\theta$ for any element $A$ of $S$ is given by the formula 


(5.2) $\theta A = [\cdots T \cdots]$, $\quad T \supset A$, $T \in \mathfrak{T}$. 


PROOF. We have $\theta A = S'$ where the element $S'$ lies in the lattice $\theta S$. Hence $S'$ is the 
cross-cut of a certain set of the $T$ in $\mathfrak{T}$. Now since $\theta A \supset A$, every such $T$ divides $A$. 
But since $\theta T = T$ if $T \in \mathfrak{T}$, $T \supset A$ implies that $T \supset \theta A = S'$. Hence (5.2) 
follows. 
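
Formula (5.2) and Theorem 5.1 can be compared directly on a small example. This is a sketch under assumptions: the lattice is the set of all subsets of {0, 1, 2, 3}, the generating set `T_SET` is an arbitrary hypothetical choice containing the unit element, and the function names are not from the paper.

```python
from itertools import combinations

UNIT = frozenset(range(4))
S = [frozenset(c) for r in range(5) for c in combinations(range(4), r)]

# Hypothetical generating set, containing the unit element.
T_SET = [UNIT, frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({0, 3})]

def theta(A):
    """Formula (5.2): cross-cut of all T in T_SET with T containing A."""
    result = UNIT
    for T in T_SET:
        if T >= A:
            result &= T
    return result

def two_valued(T):
    """Two-valued operator (5.1): T if T contains A, otherwise the unit."""
    return lambda A: T if T >= A else UNIT

def cross_cut_of_two_valued(A):
    """Pointwise cross-cut of the two-valued operators of the generators."""
    result = UNIT
    for T in T_SET:
        result &= two_valued(T)(A)
    return result

def imbedding_lattice():
    """T_SET together with the cross-cuts of all sets of its elements."""
    lattice = set(T_SET)
    changed = True
    while changed:
        changed = False
        for X in list(lattice):
            for Y in list(lattice):
                if X & Y not in lattice:
                    lattice.add(X & Y)
                    changed = True
    return lattice

LATTICE = imbedding_lattice()
```

In this example the values of `theta` are exactly the eight elements of `LATTICE`, and `theta` agrees everywhere with the cross-cut of the two-valued operators of the generators.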

THEOREM 5.3. Let $\phi$ and $\psi$ be any two closure operators of $S$. Then $\phi \supset \psi$ 
if and only if the lattice belonging to $\psi$ contains the lattice belonging to $\phi$ in the 
set-theoretic sense. 

PROOF. Assume that $\phi \supset \psi$ and let $A \in \phi S$. Then $\phi A = A$. By I 1, $\phi A \supset 
\psi A$. Hence $A \supset \psi A$. Therefore by I 2, $A = \psi A$ or $A \in \psi S$. Since $\phi$ and $\psi$ are 
the imbedding operators of their respective lattices, the converse follows from 
Theorem 5.2. 

The following corollaries are immediate: 

COROLLARY 5.31. Let $\theta$ be the imbedding operator belonging to any set $\mathfrak{T}$ of 
elements of $S$ containing $I$, and let $\psi$ be any closure operator such that $\psi$ leaves 
every element of $\mathfrak{T}$ invariant. Then $\psi$ divides $\theta$. 










COROLLARY 5.32. The imbedding operator of any set is the union of all closure 
operators which leave every element of the set invariant. 

COROLLARY 5.33. The union operation on the lattices which belong to closure 
operators defined by (4.3) is the operation of taking the set-theoretic cross-cut of 
their elements. 

It is this correspondence between operator union and set-theoretic cross-cut 
which makes the ideal operators of importance in imbedding problems. 


III. APPLICATIONS TO IMBEDDING PROBLEMS 


6. Let $I$ be a set of elements $a, b, \cdots$ semi-ordered with respect to a division 
relation $x \mid y$ and containing a unit element dividing every other element. 
The following problem has been considered by Mac Neille (Mac Neille 1): To 
construct a closed lattice $S'$ such that: (i) $S'$ contains a subset of elements 
$A', B', \cdots$ which may be set in a one-to-one correspondence $x \leftrightarrow X'$ with $a, b, \cdots$; 
(ii) If $a \leftrightarrow A'$ and $b \leftrightarrow B'$, then 


(6.1) $a \mid b$ in $I$ implies $A' \supset B'$ in $S'$. 
(6.2) $A' \supset B'$ in $S'$ implies $a \mid b$ in $I$. 


We call such a construction an "isomorphic imbedding" of the set $I$. If we 
do not require (6.2), we speak of a "homomorphic imbedding" of $I$. 

We shall solve these problems by determining suitable ideal operators in the 
lattice $\mathfrak{B}$ (Boolean algebra) of all subsets of $I$. In other words, we shall 
determine all ideal operators $\phi$ of $\mathfrak{B}$ such that $S' = \phi\mathfrak{B}$ will be a suitable lattice. 

Consider first the condition (6.1). Let $T = (t)$ be a subset of $I$ consisting of 
the single element $t$. Since $T' = \phi T \supset T$, we must have $t \in \phi T$. But by (6.1), 
if $t \mid y$ in $I$, $\phi T \supset \phi(y)$. Hence if $t \mid y$, $y \in \phi T$. Thus (6.1) implies that $\phi T$ 
must contain all elements $y$ of $I$ such that $t \mid y$. 

For a homomorphic imbedding, no further conditions are imposed on the 
values of $\phi T$. But if the imbedding is isomorphic and $\phi T \supset \phi(x)$, then (6.2) 
requires that $t \mid x$. Hence $\phi T$ must consist only of elements $x$ of $I$ such that $t \mid x$. 

We let $\mathfrak{T}$ denote the set of all $T' = \phi(t)$, $t \in I$ for any ideal operator $\phi$. We 
call the elements of $\mathfrak{T}$ the principal ideals of $I$. 

It is evident from the preceding section that any ideal operator of $\mathfrak{B}$ leaving 
every element of $\mathfrak{T}$ invariant will solve our initial imbedding problem, and that 
the simplest of these operators is the imbedding operator $\theta$ of the set $\mathfrak{T}$ itself; 
for its lattice $\theta\mathfrak{B}$ is the smallest lattice in the set-theoretic sense in which the 
imbedding can be made in $\mathfrak{B}$. The isomorphism between $I$ and $\mathfrak{T}$ with respect 
to division shows that this same minimal property of $\theta\mathfrak{B}$ will apply to any 
isomorphic imbedding of $I$ in any closed lattice $S'$ whatever; within the lattice 
$S'$ there must lie a lattice simply isomorphic to $\theta\mathfrak{B}$. $\theta\mathfrak{B}$ is the lattice defined in 
Mac Neille 1 by "Dedekind cuts." 

A similar situation occurs for homomorphic imbeddings. For a homomorphic 
imbedding, the "principal ideals" $A', B', \cdots$ which make up the set $\mathfrak{T}$ are not 
uniquely determined by the corresponding elements $a, b, \cdots$ of $I$; for if $a \leftrightarrow A'$, 
$A'$ may contain elements of $I$ not divisible by $a$. But once the set $\mathfrak{T}$ of principal 
ideals is chosen, the imbedding operator of $\mathfrak{T}$ gives the smallest lattice in which 
the particular homomorphic imbedding can be performed. 


7. If $A$ is any subset of $I$, let $\bar{A}$ be the subset of all elements $l$ such that 
$l \mid k$ for every $k$ in $A$, and let $A'$ be the subset of all elements $a$ such that $l \mid a$ 
for every $l$ in $\bar{A}$. Then the operator 


(7.1) $A' = \theta A$ 


is the isomorphic imbedding operator of the set $I$ discussed above. This result 
follows easily from Theorem 5.3. For a detailed discussion, the reader may 
consult Ward-Dilworth 2 or Clifford 1, to whom this definition of $\theta$ is originally 
due.⁹ 
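
The operator (7.1) is easy to compute for a finite semi-ordered set. The sketch below assumes the set {1, 2, 3, 12, 18} ordered by divisibility, chosen because 2 and 3 have no least common multiple inside it; the set and the helper names `lower`, `upper` and `theta` are illustrative assumptions.

```python
from itertools import combinations

# Assumed example: a set semi-ordered by divisibility, with unit element 1.
ELEMENTS = (1, 2, 3, 12, 18)

def divides(x, y):
    return y % x == 0

def lower(A):
    """All elements l such that l divides every k in A."""
    return frozenset(l for l in ELEMENTS if all(divides(l, k) for k in A))

def upper(L):
    """All elements a such that every l in L divides a."""
    return frozenset(a for a in ELEMENTS if all(divides(l, a) for l in L))

def theta(A):
    """The operator (7.1): A' = upper(lower(A))."""
    return upper(lower(A))

SUBSETS = [frozenset(c) for r in range(len(ELEMENTS) + 1)
           for c in combinations(ELEMENTS, r)]
CLOSED = {theta(A) for A in SUBSETS}              # the imbedding lattice
PRINCIPAL = {t: theta({t}) for t in ELEMENTS}     # the principal ideals

# Conditions (6.1) and (6.2): a | b exactly when A' contains B'.
isomorphic = all(divides(a, b) == (PRINCIPAL[a] >= PRINCIPAL[b])
                 for a in ELEMENTS for b in ELEMENTS)
```

In this example the lattice has seven elements: the five principal ideals, the empty set, and the new element {12, 18}, which plays the part of the missing least common multiple of 2 and 3.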


CALIFORNIA INSTITUTE OF TECHNOLOGY. 


REFERENCES 


GARRETT BIRKHOFF 
    1. Duke Math. Journal, 3 (1937) pp. 443-454. 
A. H. CLIFFORD 
    1. Bull. Am. Math. Soc. 40 (1934) pp. 326-330. 
    2. These Annals (2) 39 (1938) pp. 594-610. 
C. KURATOWSKI 
    1. Topologie 1, Warsaw (1933). 
H. M. MAC NEILLE 
    1. Trans. Am. Math. Soc. 42 (1937) pp. 416-460. 
O. ORE 
    1. These Annals (2) 36 (1935) pp. 406-437. 
M. WARD AND R. P. DILWORTH 
    1. Trans. Am. Math. Soc. vol. 45 (1939), pp. 335-354. 
M. WARD 
    2. Unpublished. 





⁹ The identity of this operator and Mac Neille's operator was pointed out to me by Dr. 
A. H. Clifford in a letter. The definition (7.1) is used in Ward-Dilworth 2 to imbed any 
ovum (semi-group) in a residuated lattice of ideals. 









ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 


UNSTABLE MINIMAL SURFACES WITH SEVERAL BOUNDARIES¹ 


By MAX SHIFFMAN 
(Received August 21, 1940; revised October 1, 1941) 


TABLE OF CONTENTS 

Introduction ... 197 

PART I. DEGENERATE DOMAINS. THE SPACES $\mathfrak{N}_k$ AND $\mathfrak{P}$. 
THE FUNCTIONAL $E[\mathfrak{x}]$ 

1. The Domains of Representation ... 198 
2. A Metric for $\mathfrak{N}_2$ ... 201 
3. ... 204 
4. The Connectivity Numbers of $\mathfrak{N}_k$ ... 206 
5. ... 207 
6. The Continuity of the Functional $E[\mathfrak{x}]$ ... 208 

PART II. APPLICATION TO UNSTABLE MINIMAL SURFACES 

7. ... 211 
8. ... 213 
9. The Case of Polygonal Boundaries ... 215 
10. ... 217 
... 221 
... 222 

INTRODUCTION 


In recent years, the question of unstable minimal surfaces bounded by a 
single contour has been attacked with success.² It has been shown that the 
Morse critical point theory applies to minimal surfaces bounded by the contour $\Gamma$ 
(provided $\Gamma$ satisfies certain restrictions). The contributions to the theory 
have been made by the author [11], [12], by Morse and Tompkins [7], [8] and 
by Courant [3]. 

In the present paper we shall show how to extend the theory to cover minimal 
surfaces of genus zero bounded by an arbitrary number $k$ of non-intersecting 
contours $\Gamma_1, \Gamma_2, \cdots, \Gamma_k$. In a general but not precise way, the main result 
may be stated thus: if the Morse theory can be shown to apply to each of the 
contours $\Gamma_1, \cdots, \Gamma_k$ individually, it applies to all the contours $\Gamma_1, \cdots, \Gamma_k$ 
together. 





¹ Presented to the American Mathematical Society, April 26, 1940. 

² Concerning the Plateau problem and minimal surfaces in general, see [1], [2], [4], [5], 
[10]. Numbers in brackets refer to the bibliography at the end. 

³ In the meantime a paper by Morse and Tompkins [9] has appeared which considers the 
case of two boundaries. 




A first step in the development of the theory is the following (see [12], [7]). 
Let $\mathfrak{P}$ denote the space of admissible potential surfaces $\mathfrak{x}(u, v)$ with a finite 
Dirichlet integral $D[\mathfrak{x}]$, where 

$$D[\mathfrak{x}] = \tfrac{1}{2} \iint (\mathfrak{x}_u^2 + \mathfrak{x}_v^2)\, du\, dv;$$

and let $\mathfrak{P}_N$ denote the subspace of $\mathfrak{P}$ for which $D[\mathfrak{x}] \le N$. The first step, used 
even in the minimum theory, is that $\mathfrak{P}_N$ be compact. 

For the case of several boundaries, this requirement is not satisfied for ordinary 
potential surfaces. To retain the compactness of $\mathfrak{P}_N$ it is necessary to introduce 
degenerate potential surfaces, consisting of several pieces, and their 
corresponding degenerate parameter domains. These degenerate domains and 
surfaces have been so selected that the discontinuities of the Dirichlet integral can 
be reduced to its discontinuities for single boundaries. More precisely, suppose 
that $\mathfrak{x}$ is an admissible potential surface defined over a domain $G$ having $k$ 
boundaries. Let $G_\mu$ be the region of the plane bounded by the $\mu$th boundary 
curve $C_\mu$ and intersecting $G$, and $\mathfrak{x}_\mu$ the potential surface defined over $G_\mu$ with 
boundary values identical to $\mathfrak{x}$ on $C_\mu$. Set 

$$E[\mathfrak{x}] = D_G[\mathfrak{x}] - \sum_{\mu=1}^{k} D_{G_\mu}[\mathfrak{x}_\mu].$$

Then the degenerate domains and surfaces introduced here are such that $E[\mathfrak{x}]$ 
is a continuous functional. 

These two theorems, the compactness of $\mathfrak{P}_N$ and the continuity of $E[\mathfrak{x}]$, form 
the backbone of this paper. In Part II, the continuity of $E[\mathfrak{x}]$ yields very 
simply the reducibility property of $D[\mathfrak{x}]$ provided that reducibility is known 
for simple contours. 

For two boundaries, the theory can be developed much more simply. But 
the decisive step is the case $k > 2$. 


PART I. DEGENERATE DOMAINS. THE SPACES $\mathfrak{N}_k$ AND $\mathfrak{P}$. 
THE FUNCTIONAL $E[\mathfrak{x}]$ 


1. The Domains of Representation 


Let $\Gamma_1, \Gamma_2, \cdots, \Gamma_k$ be $k$ non-intersecting closed Jordan curves in space. On 
each $\Gamma_\mu$ select three distinct points $P_\mu, Q_\mu, R_\mu$. We shall consider potential 
surfaces $\mathfrak{x}(u, v)$, defined over certain normal domains of the $w = u + iv$ plane, 
which map the boundaries of these domains continuously and monotonically 
into $\Gamma_1, \Gamma_2, \cdots, \Gamma_k$. 

Choose, for the ordinary (i.e., non-degenerate) domains of representation in 
the $w$-plane, circular regions $G$ consisting of an annular ring, unit circle as outer 
boundary, with $k - 2$ additional circular holes. Let $C_\mu$ denote the circular 
boundary of $G$ mapped by $\mathfrak{x}(u, v)$ into $\Gamma_\mu$, and let $p_\mu, q_\mu, r_\mu$ be three distinct 
points of $C_\mu$ mapped by $\mathfrak{x}(u, v)$ into $P_\mu, Q_\mu, R_\mu$ respectively. These specified 
points $p_\mu, q_\mu, r_\mu$, $\mu = 1, 2, \cdots, k$, will be considered as part of the domain of 
representation $G$. In considering the class of admissible potential surfaces 
$\mathfrak{x}(u, v)$, not only do the boundary values of $\mathfrak{x}(u, v)$ vary but also the domains of 
representation (including the points $p_\mu, q_\mu, r_\mu$). 

It is possible to normalize G, by performing linear transformations and reflec- 
tions, in such a way that the following conditions are satisfied: 

(a) $C_1$ is the unit circle; 

(b) $C_2$ is concentric with the unit circle; 

(c) $p_1, q_1, r_1$ lie in counterclockwise order around $C_1$; and 

(d) for $k = 2$, $p_1$ is at the point $w = 1$ of the $w$-plane, while for $k > 2$ the 
center of $C_3$ is on the positive real axis. 

We shall suppose throughout that G has been normalized in this way. 

It is necessary to define the limit of a set of domains, and for compactness to 
introduce degenerate domains. Our procedure is motivated by considering a 
sequence $\mathfrak{x}^n(u, v)$ of admissible potential surfaces with uniformly bounded 
Dirichlet functional $D[\mathfrak{x}^n] \le N$. Let $G^n$ be the ordinary domains over which 
the $\mathfrak{x}^n$ are defined. Then obtain a limit domain $G$ and a limit potential surface 
$\mathfrak{x}$ such that all pieces which contribute a non-vanishing term to the Dirichlet 
integral are retained. 

The only possibilities for the degeneracy of the sequence $\mathfrak{x}^n$ are:⁴ the circles 
of $G^n$ degenerate as $n \to \infty$; and the boundary values of $\mathfrak{x}^n$ are not 
equicontinuous. Now, the non-equicontinuity of the boundary values of $\mathfrak{x}^n$ on $C_\mu$ is 
equivalent to the assertion that the minimum distance between the three points 
$p_\mu^n, q_\mu^n, r_\mu^n$ has zero as a limiting value as $n \to \infty$. Since $p_\mu, q_\mu, r_\mu$ are 
considered as part of the domain $G$, all degeneracies have been transferred to the 
domain. This limits the discussion of the sequence $\mathfrak{x}^n$, so far as degeneration 
is concerned, to that of the domains $G^n$. 

Accordingly, let $G^n$ be a sequence of ordinary domains, and (by choosing a 
subsequence if necessary) suppose that $p_\mu^n, q_\mu^n, r_\mu^n$ converge to definite points 
of the $w$-plane as $n \to \infty$. The sequence has no limit domain if any pair of 
circles with radii above a positive bound approach each other as $n \to \infty$ (this 
would contradict $D[\mathfrak{x}^n] \le N$), and we suppose that this is not the case. The 
$G^n$'s are said to tend to degeneracy if the $3k$ limit points of $p_\mu^n, q_\mu^n, r_\mu^n$ as $n \to \infty$ 
are not all distinct. 

If the domains $G^n$ do not tend to degeneracy as $n \to \infty$, the limit domain $G^\infty$ 
is immediate: $G^\infty$ is the circular domain whose points $p_\mu^\infty, q_\mu^\infty, r_\mu^\infty$ and circles $C_\mu^\infty$ 
are the limits of those of $G^n$. 

If the sequence $G^n$ does tend to degeneracy, one or several of the three 
following types of degeneration must occur (for convenience a circle of $G^n$ with 
radius approaching 0 as $n \to \infty$ is called a small circle; one whose radius does 
not approach 0, a large circle): (a) One or more small circles tend to a point $A$ 
away from all other circles; (b) One or more small circles and a large circle $C_\mu^n$ 
approach a point $A$, while at most one of the points $p_\mu^n, q_\mu^n, r_\mu^n$ on $C_\mu^n$ approach 
$A$ as $n \to \infty$; (c) At least two of the points $p_\mu^n, q_\mu^n, r_\mu^n$ on a large circle $C_\mu^n$ 
approach a point $A$. In cases (b) and (c) let $A^n$ be a point on the large circle 
$C_\mu^n$ approaching $A$; for (a), let $A^n$ be $A$. 

⁴ For all the properties of the sequence $\mathfrak{x}^n$ used in the next two pages, see [1]. 

In any of these three situations, describe in $G^n$ a circle or arc of circle $K^n$ 
with $A^n$ as center and with the radius $(\rho^n)^{1/2}$, where $\rho^n$ is the maximum distance 
from $A^n$ to the small circles and points $p_\mu^n, q_\mu^n, r_\mu^n$ which may be approaching $A$. 
(It can be shown that the oscillation of $\mathfrak{x}^n$ on $K^n$ tends to 0.) $K^n$ splits $G^n$ 
into two regions $G_1^n$ and $G_2^n$, large and small respectively. Also, in cases (b) 
and (c), $K^n$ divides $C_\mu^n$ into two arcs $\alpha_\mu^n$ and $\beta_\mu^n$, where $\alpha_\mu^n$ contains at least two 
of the points $p_\mu^n, q_\mu^n, r_\mu^n$ and $\beta_\mu^n$ at most one of these points. The arc $\alpha_\mu^n$ is to 
be counted in the limit as the circle $C_\mu$; and $\beta_\mu^n$ in the limit as coordinated to a 
single point of $C_\mu$ (since the oscillation of $\mathfrak{x}^n$ on $\beta_\mu^n$ approaches 0 as $n \to \infty$). 
Separate $G_1^n$ and $G_2^n$ and normalize them both by performing linear 
transformations and inversions. In this normalization, disregard the $K^n$; temporarily, 
$G_1^n$ and $G_2^n$ are considered as bounded by circles. 

Continue as above with the regions $G_1^n$ and $G_2^n$ separately, but with the 
following provisos: $K^n$ is to be disregarded, and $\beta_\mu^n$ is to be disregarded if it 
approaches a point away from all other circles. One finally obtains the 
decomposition of $G^n$ (a subsequence of the original $G^n$'s) into several regions $H_1^n, 
H_2^n, \cdots, H_l^n$ each of which degenerates no further. A passage to the limit 
yields a degenerate domain $G^\infty$ consisting of several distinct circular regions 
$H_1, H_2, \cdots, H_l$; $G^\infty$ is called the limit of this final subsequence $G^n$. (A 
subsequence of the $\mathfrak{x}^n$ can then be chosen to converge uniformly to a limit potential 
surface $\mathfrak{x}^\infty$ defined over the various regions which compose $G^\infty$.) 

We are now in a position to completely define degenerate domains. A 
degenerate domain $G$ consists of several distinct circular regions, each supposed 
normalized. In all the regions together there are exactly $k$ circles of type $C_\mu$, 
$\mu = 1, 2, \cdots, k$, called boundary circles, and on each $C_\mu$ three distinct points 
$p_\mu, q_\mu, r_\mu$; there may be additional circles, denoted by $c_\mu, c_\mu', \cdots$ and called 
point circles, which are coordinated to single points on $C_\mu$, likewise marked 
$c_\mu, c_\mu', \cdots$. (The limit potential surface $\mathfrak{x}^\infty$ would take constant values on 
each point circle $c_\mu$ equal to its value at the corresponding point $c_\mu$ on $C_\mu$.) The 
domain is of genus zero, i.e., if the point circles $c_\mu, c_\mu', \cdots$ are attached to the 
corresponding points $c_\mu, c_\mu', \cdots$, then every closed curve disconnects the 
resulting surface. Finally there is no region with a single boundary and that 
boundary a point circle $c_\mu$ (in such a region $\mathfrak{x}(u, v)$ would be identically constant). 

A limit of a sequence of degenerate domains is obtained as previously, 
considering each region separately and taking the circles and points $c_\mu, c_\mu', \cdots$ 
into account. 

Finally, a domain $G$ is a limit of a set $X$ of domains, degenerate or ordinary, 
if $X$ contains a sequence having $G$ as a limit in the above sense. The totality 
of domains, ordinary and degenerate, together with this definition of limit will 
be denoted by $\mathfrak{N}_k$. $\mathfrak{N}_k$ seems to be the natural totality of domains of 
representation for potential functions. 


Figures 1 and 2 illustrate the various kinds of domains with two boundaries. 
The degenerate domain in fig. 2(a) is the limit of a sequence of ordinary domains 
(fig. 1) whose boundary values are not equicontinuous on $C_2$; fig. 2(b) is the 
limit of a sequence of ordinary domains whose boundary values are not 
equicontinuous on $C_1$; fig. 2(c), whose boundary values are not equicontinuous on both 
$C_1$ and $C_2$; and fig. 2(d), for which the radius of $C_2$ approaches zero. For more 
than two boundaries the degenerate domains become much more numerous and 
complicated. 

It is necessary to know that $\mathfrak{N}_k$ is a topologic space, and more, that it is metric. 
This will be established by imbedding $\mathfrak{N}_k$ into a Euclidean space of sufficiently 
high dimension. We begin with the case of two boundaries. 


Figure 1. Ordinary domains in $\mathfrak{N}_2$. Labels in the figure: 
$a e^{i\alpha} = (\infty, p_1, q_1, r_1)$, $b e^{i\beta} = (0, p_2, q_2, r_2)$, $\rho e^{i\phi} = (\infty, 0, p_1, p_2)$. 


2. A Metric for $\mathfrak{N}_2$ 


Consider first the ordinary domains in $\mathfrak{N}_2$ (the case of two boundaries) as in 
fig. 1. The domains depend on 6 variable parameters which we will choose 
below so as to vary continuously with the domain. The principal difficulty in 
this choice is caused by the degenerate domains. In the passage to a limit from 
a sequence of ordinary domains to a degenerate domain, many linear 
transformations and reflections were performed; and the degenerate domains may possibly 
contain points $c_\mu$ corresponding to point circles $c_\mu$. The parameters must be 
chosen to take these two possibilities into account. 

The following six parameters $a, \alpha, b, \beta, \rho, \phi$ are selected (angles are measured 
in the interval from $-\pi$ to $\pi$, excluding $-\pi$): 


(1) $(\infty, p_1, q_1, r_1) = a e^{i\alpha}$ $\quad (0 < a < \infty,\ 0 < \alpha < \pi)$, 
    $(0, p_2, q_2, r_2) = b e^{i\beta}$ $\quad (0 < b < \infty,\ \beta \ne 0, \pi)$, 
    $(\infty, 0, p_1, p_2) = \rho e^{i\phi}$ $\quad (0 < \rho < 1)$, 


where $(w_1, w_2, w_3, w_4)$ means the cross ratio 

$$\frac{w_1 - w_3}{w_1 - w_4} \cdot \frac{w_2 - w_4}{w_2 - w_3}.$$

Cross ratio is chosen so as to obtain invariance under linear transformations, and the 
points $\infty$, 0 correspond to points of type $c_1, c_2$ in degenerate domains. The 
precise reason for the choice of the points $\infty$, 0 in the cross ratio expressions is 
made apparent in the proof of Lemma 1 below, part (b). 


Figure 2. Degenerate domains in $\mathfrak{N}_2$. Labels in the panels: 
2a: $a e^{i\alpha} = (\infty, p_1, q_1, r_1)$, $b e^{i\beta} = (c_2, p_2, q_2, r_2)$, $\rho$ as in diagram. 
2b: $a e^{i\alpha} = (c_1, p_1, q_1, r_1)$, $b e^{i\beta} = (0, p_2, q_2, r_2)$, $\rho$ as in diagram. 
2c: $a e^{i\alpha} = (c_1, p_1, q_1, r_1)$, $b e^{i\beta} = (c_2, p_2, q_2, r_2)$, $\rho$ as in diagram. 
2d: no labels. 
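
The role of the cross ratio can be checked numerically. The sketch below is an illustration under assumptions: it takes the cross ratio in the form $\frac{w_1 - w_3}{w_1 - w_4} \cdot \frac{w_2 - w_4}{w_2 - w_3}$, and the sample points, the radius 0.4 and the coefficients of the linear (Möbius) transformation are arbitrary hypothetical values.

```python
import cmath

def cross_ratio(w1, w2, w3, w4):
    """Cross ratio (w1, w2, w3, w4) in the assumed ordering convention."""
    return ((w1 - w3) * (w2 - w4)) / ((w1 - w4) * (w2 - w3))

def mobius(w, a=2 + 1j, b=0.5j, c=1.0, d=3 - 1j):
    """A sample linear transformation w -> (a w + b) / (c w + d), ad - bc != 0."""
    return (a * w + b) / (c * w + d)

# Three points on the unit circle C_1 and three on a concentric circle C_2.
p1, q1, r1 = (cmath.exp(1j * t) for t in (0.0, 1.0, 2.5))
rho = 0.4
p2, q2, r2 = (rho * cmath.exp(1j * t) for t in (0.3, 2.0, 4.0))

# Invariance under the linear transformation.
before = cross_ratio(p2, p1, q1, r1)
after = cross_ratio(*(mobius(w) for w in (p2, p1, q1, r1)))

# With first entry 0 and C_2 centered at 0, the modulus is a ratio of chords.
b_e_ibeta = cross_ratio(0, p2, q2, r2)
b_from_chords = abs(p2 - r2) / abs(p2 - q2)

# With first entry tending to infinity the cross ratio tends to (p1-r1)/(p1-q1).
a_e_ialpha_limit = (p1 - r1) / (p1 - q1)
a_e_ialpha_far = cross_ratio(1e12, p1, q1, r1)

# (infinity, 0, p1, p2) reduces to p2 / p1, whose modulus is the radius of C_2.
rho_e_iphi = p2 / p1
```

Because the cross ratio is unchanged by such transformations, the parameters in (1) do not depend on how a region was normalized, which is the property the choice is meant to secure.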




The geometric meanings of $\alpha, \beta, \rho, \phi$ are indicated in fig. 1, while 
$a = \overline{p_1 r_1} / \overline{p_1 q_1}$ and $b = \overline{p_2 r_2} / \overline{p_2 q_2}$. 
It is easily established geometrically that the parameters $a, \alpha, b, 
\beta, \rho, \phi$ subject to the inequalities in (1) (these inequalities are necessary for the 
non-degeneracy and normalization of the domain) uniquely determine the 
domain. 

For degenerate domains, the parameters $a, \alpha, b, \beta, \rho, \phi$ are defined as in fig. 2. 
The undefined parameters may have any value (e.g., in fig. 2(a), $\phi$ may have 
any value). For a degenerate domain, note that either $a e^{i\alpha}$ is real (so that 
$\alpha = 0$ or $\pi$, or $a = 0$ or $\infty$), or $b e^{i\beta}$ is real, or $\rho = 0$. 

A remaining difficulty is the indeterminateness of some of the parameters in 
the case of degenerate domains. A method for eliminating this difficulty is to 
introduce a set of new parameters which, when defined in terms of $a, \alpha, b, \beta, \rho, \phi$, 
contain factors vanishing for degenerate domains. This of course necessitates 
introducing more than 6 new parameters. One such possibility is the set of 
nine parameters $x_\nu$, $\nu = 1, \cdots, 9$, interpreted as the coordinates of a point $x$ 
in Euclidean space $E_9$, defined by 


( 





a we wh lt te ee 
1 1 — p’ 2 Pita’ 

b ‘ a ia 
atte <  ieits «Sl 


(2) 





° ie b i|8| . a *) 
ta + tty = pe (¢ Temes * . 








| + iz9 = psina sin Bes z a] + Re 

If $a = \infty$, $a/(1 + a)$ and $a/(1 + a^2)$ mean 1 and 0 respectively; and similarly 
for $b$. The complicated expression for $x_6 + i x_7$ is necessary because reflections 
may occur in passing from a sequence of ordinary domains to a degenerate 
domain; this is made clear in the proof of Lemma 1, part (b). 

A simple examination shows that a unique point in $E_9$ corresponds to a single 
domain in $\mathfrak{N}_2$, degenerate or not; and to different domains in $\mathfrak{N}_2$ correspond 
different points in $E_9$. The set of points in $E_9$ corresponding to the whole of $\mathfrak{N}_2$ 
is easily shown to form a closed 6-dimensional set extending to infinity (in the 
$x_1$-direction). 

LEMMA 1. The one-to-one mapping of $\mathfrak{N}_2$ onto a point set of $E_9$, given by (1), 
fig. 2, and (2), is a topologic mapping. 

PROOF. It is required to show that the one-to-one correspondence preserves the limit relation. Let $G^n$ be a sequence of domains of $\mathfrak{R}_2$ and $x^n$ the corresponding points in $\mathfrak{E}_9$. Since $\rho^n \to 1$ implies $x_1^n \to \infty$, and conversely, the sequence $G^n$ has no limit domain if and only if $x^n$ has no limit point. It remains to establish that $x^n \to x$ whenever $G^n \to G$, where $x$ is the point in $\mathfrak{E}_9$ corresponding to the domain $G$. This is trivial if the domains $G^n$ do not degenerate as $n \to \infty$. If the sequence $G^n$ tends to degeneracy, the possibilities are (see §1):

(a) $\rho^n \to 0$. By §1, the limit domain $G$ is given in fig. 2(d), and $x$ is therefore the origin of $\mathfrak{E}_9$. The equations (2) show that $x^n \to x$.

(b) Two of the points $p_1^n, q_1^n, r_1^n$ converge to the same point $A$ of the unit circle, while $p_2^n, q_2^n, r_2^n$ converge to distinct points. Let $B$ be the point on the unit circle diametrically opposite to $A$. The limit domain $G$ is obtained by describing $K^n$, separating $G_1^n$ and $G_2^n$, and normalizing each of these regions. Hence $G$ is of the type shown in fig. 2(b). The normalization of $G_1^n$ is performed by inverting with respect to $K^n$, transforming and inverting so that $p_1^n, q_1^n, r_1^n$ go into the specified points $\theta_1, \theta_2, \theta_3$. The point $c_1$ in $H_2$ is the limit of the resulting point $B$ in $G_1^n$. Since cross ratio is invariant under linear transformations,

$$(B, p_1^n, q_1^n, r_1^n) = (c_1^n, p_1, q_1, r_1).$$

By noting that

$$(B, p_1^n, q_1^n, r_1^n) = \frac{B - q_1^n}{B - r_1^n} \cdot \frac{p_1^n - r_1^n}{p_1^n - q_1^n} \quad \text{and} \quad (\infty, p_1^n, q_1^n, r_1^n) = \frac{p_1^n - r_1^n}{p_1^n - q_1^n},$$

it is easy to show that $(B, p_1^n, q_1^n, r_1^n)$ and $(\infty, p_1^n, q_1^n, r_1^n)$ approach the same value. Thus,

$$(\infty, p_1^n, q_1^n, r_1^n) \to (c_1, p_1, q_1, r_1), \quad \text{or} \quad a^n e^{i\alpha^n} \to a e^{i\alpha}$$

(where $\alpha = 0$ or $\pi$, or $a = 0$ or $\infty$).

$G_2^n$ is normalized by inverting with respect to $C_2$, expanding, rotating and possibly reflecting. Hence $(0, p_2^n, q_2^n, r_2^n)$ approaches either $(0, p_2, q_2, r_2)$ or its conjugate complex, so that $b^n e^{i|\beta^n|} \to b e^{i|\beta|}$. Since clearly $\rho^n \to \rho$, an examination of equations (2) shows that $x^n \to x$.

(c) A similar argument applies if two of $p_2^n, q_2^n, r_2^n$ approach the same point $A'$ on $C_2$, while $p_1^n, q_1^n, r_1^n$ approach distinct points. The limit domain $G$ is of the type in fig. 2(a).

(d) Similarly, if two of $p_1^n, q_1^n, r_1^n$ approach $A$, and two of $p_2^n, q_2^n, r_2^n$ approach $A'$. The limit domain $G$ is given in fig. 2(c).

Like results are obtained if the $G^n$ are degenerate.

In all cases therefore, $G^n \to G$ implies that $x^n \to x$. The lemma is established.

Thus, $\mathfrak{R}_2$ is a metric space. The metric could be defined as follows: the distance between two domains of $\mathfrak{R}_2$ is the Euclidean distance between their corresponding points in $\mathfrak{E}_9$.


3. The Metric Space $\mathfrak{R}_k$


We return to the case of $k$ boundaries and construct a metric for $\mathfrak{R}_k$. The parameters which will be used for any domain $G$ of $\mathfrak{R}_k$ are the 9 parameters of the preceding section for each pair of circular boundaries of $G$. Accordingly, suppose first that $G$ is an ordinary domain of $\mathfrak{R}_k$, and $C_\mu$, $C_\nu$ ($\mu < \nu$) any pair of boundaries of $G$ with the specified points $p_\mu, q_\mu, r_\mu, p_\nu, q_\nu, r_\nu$. Nine coordinates $x_1^{\mu\nu}, \dots, x_9^{\mu\nu}$ may be associated to this pair of boundaries. They are determined by equations (2) of §2, where the parameters $a, \alpha, b, \beta, \rho, \epsilon$ are given by

$$
\begin{aligned}
a e^{i\epsilon\alpha} &= (s_\nu, p_\mu, q_\mu, r_\mu), \\
b e^{i\epsilon\beta} &= (s_\mu, p_\nu, q_\nu, r_\nu), \\
\rho &= |(s_\mu, s_\nu, p_\mu, p_\nu)|.
\end{aligned}
\tag{3}
$$


Here $s_\mu$ and $s_\nu$ are the two points which are mutually inverse with respect to both circles $C_\mu$ and $C_\nu$, and $s_\mu$ is on the side of $C_\mu$ opposite to $C_\nu$; also $0 \le \alpha \le \pi$ and $\epsilon = \pm 1$. These parameters and coordinates are invariant under linear transformations and reflections. They are the invariant generalizations of the equations (1) of §2.
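The invariance of the cross ratio under linear (Möbius) transformations, on which these parameters rest, can be checked numerically. The following is only an illustrative sketch: the ordering convention for the cross ratio is the one read off from the proof of Lemma 1(b), and the function names and sample points are assumptions made for this example.

```python
def cross_ratio(z1, z2, z3, z4):
    """Cross ratio (z1, z2, z3, z4) = (z1 - z3)(z2 - z4) / ((z1 - z4)(z2 - z3)).

    The ordering is an assumed convention; invariance holds for any of the
    standard orderings.
    """
    return (z1 - z3) * (z2 - z4) / ((z1 - z4) * (z2 - z3))


def cross_ratio_at_infinity(z2, z3, z4):
    """Limit of cross_ratio(z1, z2, z3, z4) as z1 tends to infinity."""
    return (z2 - z4) / (z2 - z3)


def mobius(a, b, c, d):
    """Return the linear transformation z -> (a z + b) / (c z + d)."""
    if a * d - b * c == 0:
        raise ValueError("degenerate transformation: ad - bc = 0")
    return lambda z: (a * z + b) / (c * z + d)


# Hypothetical sample points and transformation, chosen only for illustration.
points = (0.3 + 0.4j, -1 + 2j, 2 - 1j, 0.5j)
T = mobius(2 + 1j, -1, 0.5j, 3)
before = cross_ratio(*points)
after = cross_ratio(*(T(z) for z in points))
```

Here `before` and `after` agree up to rounding, which is the property that makes the parameters in (3) independent of the normalization of the domain.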

When $G$ is a degenerate domain, the six parameters $a, \alpha, b, \beta, \rho, \epsilon$ for the pair $C_\mu$, $C_\nu$ ($\mu < \nu$) of boundaries are defined as invariant generalizations of figs. 1, 2(a)-2(d). In this generalization, $s_\mu$ and $s_\nu$ replace $\infty$ or 0, and $\epsilon$ times an angle (where $\epsilon = \pm 1$) replaces the angle. Thus,

1) if $C_\mu$ and $C_\nu$ occur in the same region, the parameters are given by equations (3) (see fig. 1);

2) if $C_\mu$ and a point circle $c_\nu$ are in the same region,

$$
\begin{aligned}
a e^{i\epsilon\alpha} &= (s_\nu, p_\mu, q_\mu, r_\mu), \\
b e^{i\epsilon\beta} &= (c_\nu, p_\nu, q_\nu, r_\nu), \\
\rho &= |(s_\mu, s_\nu, p_\mu, p_\nu)|,
\end{aligned}
\qquad \text{(see fig. 2(a))}
$$

where $p_\nu$ is any point on the circle $c_\nu$;

3) if a point circle $c_\mu$ and a boundary circle $C_\nu$ are in the same region, the parameters are the invariant generalizations of fig. 2(b). (Cf. 2) above.)

4) if point circles $c_\mu$ and $c_\nu$ are in the same region, the parameters are the generalizations of fig. 2(c). (Cf. 2) above.)

5) if neither $C_\mu$ nor any $c_\mu$ occurs in the same region as $C_\nu$ or any $c_\nu$, then $\rho = 0$ (see fig. 2(d)).

Since the domain is of genus zero, exactly one of the above occurs for a given pair $\mu$, $\nu$.

The parameters $a, \alpha, b, \beta, \rho, \epsilon$ for the pair $C_\mu$, $C_\nu$ of boundaries of $G$ determine the 9 coordinates $x_1^{\mu\nu}, \dots, x_9^{\mu\nu}$ by equations (2). We therefore have, for any domain $G$ in $\mathfrak{R}_k$, a unique point $x$ in the $9k(k - 1)/2$ dimensional Euclidean space $\mathfrak{E}$ with the coordinates $x_1^{\mu\nu}, \dots, x_9^{\mu\nu}$ for all $\mu, \nu = 1, 2, \dots, k$, and $\mu < \nu$. A simple discussion shows that to different domains in $\mathfrak{R}_k$ correspond different points in $\mathfrak{E}$.

LEMMA 2. The one-to-one correspondence given above of $\mathfrak{R}_k$ onto a point set of $\mathfrak{E}$ is a topologic correspondence.

PROOF. The lemma follows from Lemma 1 if one notes that essentially the process in §1 treats two particular boundaries $C_\mu$, $C_\nu$ exactly as they would be treated if they were alone.














Thus, $\mathfrak{R}_k$ is a metric space. Furthermore, the discussion in §1 yields the result that the set of points in $\mathfrak{E}$ corresponding to the whole of $\mathfrak{R}_k$ is a closed set extending to infinity. The fact that the $6k - 6$ dimensional open set (open in $\mathfrak{R}_k$) of all ordinary domains is everywhere dense in $\mathfrak{R}_k$ completes the proof of

THEOREM 1. $\mathfrak{R}_k$ is a $6k - 6$ dimensional metric space in which bounded sets are compact.


4. The Connectivity Numbers of $\mathfrak{R}_k$


The connectivity numbers of $\mathfrak{R}_k$ will be in the sense of Vietoris theory, considering only cycles and homologies which lie in bounded subsets of $\mathfrak{R}_k$. The Vietoris character of the cycles will be used in an essential way in the proof of the following theorem.

THEOREM 2. The connectivity numbers $R_0, R_1, \dots, R_i, \dots$ of $\mathfrak{R}_k$ have the values:

$$R_0 = 1, \qquad R_1 = R_2 = \cdots = R_i = \cdots = 0.$$


PROOF. The essential idea of the proof is to deform the whole space $\mathfrak{R}_k$ into a single point, namely the domain consisting of $k$ unit circles (which is the most degenerate domain).

(a) Denote the invariant $\rho$ corresponding to the boundaries $C_\mu$, $C_\nu$ ($\mu < \nu$) by $\rho^{\mu\nu}$, and set $\sigma = \min_{\mu,\nu} \rho^{\mu\nu}$. Designate the set of all domains of $\mathfrak{R}_k$ for which $\sigma \le \delta$ by $F_\delta$. We shall first deform $\mathfrak{R}_k$ into $F_\delta$, where $\delta$ is any positive number.

The deformed domain $G_t$, $0 \le t \le 1$, will be obtained from the domain $G$ by contracting each circle, except the unit circle, by a certain factor. It is important in this that all the $G$'s be normalized the following way: in each region of $G$ the order for determining which circle shall be the unit circle, the concentric circle, etc. is $C_1$ or $c_1$, $C_2$ or $c_2$, $\dots$, $C_k$ or $c_k$. If $G$ belongs to $F_\delta$, set $G_t = G$. If the $\sigma$ for $G$ is $> \delta$, let $\eta$ ($< 1$) be that contraction of the circles of $G$ which takes $G$ into a domain $G_1$ for which $\sigma = \delta$; define $G_t$ as the domain obtained from $G$ by contracting its circles by the factor $(1 - t) + t\eta$.

It is necessary to show that $G_t$ is continuous in $G$ and $t$, i.e., $G^n_{t^n} \to G_t$ whenever $G^n \to G$, $t^n \to t$. This is trivial if $\sigma^n \le \delta$; if $\sigma^n > \delta$, it follows by noting that $G^n$ can degenerate further only by the coalescence of two of $p_\mu^n, q_\mu^n, r_\mu^n$, whereas contraction to form $G^n_{t^n}$ retains the relative position of these points.

One easily notices by consulting equations (2) that the above deformation deforms any bounded set $B$ of $\mathfrak{R}_k$ over a bounded set $B'$ into $F_\delta$; i.e., for a given bounded set $B$ there is another bounded set $B'$ independent of $\delta$ such that $B$ is deformed within $B'$ into $F_\delta$.

(b) Let $z^m$ be any Vietoris $m$-cycle on a bounded set $B$. Because of (a), $z^m \sim w_\delta^m$, where $w_\delta^m$ is a Vietoris $m$-cycle on $B' \cdot F_\delta$ and the homology takes place over $B'$. Since $B' \cdot F_\delta$ is arbitrarily close, for $\delta$ sufficiently small, to the closed compact set $B' \cdot F_0$, it follows from a basic theorem for Vietoris cycles that $z^m \sim w_0^m$, where $w_0^m$ is a Vietoris cycle on $B' \cdot F_0$.°





° A proof of an essentially equivalent theorem is contained in p. 20-23, Theorem 5.1, 
of [6]. 







(c) $F_0$ consists of the domains $G$ for which at least one of the $\rho^{\mu\nu}$ is zero; set $\sigma'$ = minimum of the remaining $\rho^{\mu\nu}$. Designate the set of all domains of $F_0$ for which $\sigma' \le \delta$ by $F_{0,\delta}$. By contracting circles, as in (a), $F_0$ can be deformed into $F_{0,\delta}$ for any positive $\delta$. As in (b), $z^m \sim w_0^m \sim w_{00}^m$, where $w_{00}^m$ is a Vietoris $m$-cycle on $B'' \cdot F_{00}$.

Continue with $F_{00}$ as with $F_0$, etc. One finally obtains $z^m \sim w^m$, where $w^m$ is a Vietoris $m$-cycle on the set $F$ consisting of those domains $G$ of $\mathfrak{R}_k$ for which all the $\rho^{\mu\nu}$ are zero. But $F$ contains a single point, namely the domain consisting of $k$ unit circles, bounded by $C_1, C_2, \dots, C_k$ respectively, with the points $p_\mu, q_\mu, r_\mu$ at the specified places $\theta_1, \theta_2, \theta_3$ of $C_\mu$. The theorem follows immediately.


5. The Space $\mathfrak{P}$


We are now prepared to discuss the space $\mathfrak{P}$ of potential surfaces bounded by $\Gamma_1, \dots, \Gamma_k$. An element of $\mathfrak{P}$ is a potential surface $\mathfrak{x}(u, v)$ defined over a domain $G$ of $\mathfrak{R}_k$ and satisfying the following conditions: $\mathfrak{x}(u, v)$ maps each $C_\mu$ continuously and monotonically onto $\Gamma_\mu$, and $p_\mu, q_\mu, r_\mu$ into the prescribed points $P_\mu, Q_\mu, R_\mu$ of $\Gamma_\mu$; $\mathfrak{x}(u, v)$ takes constant values at each point circle $c_\mu$, equal to its value at the corresponding point $c_\mu$ on $C_\mu$; and $D_G[\mathfrak{x}]$ is finite, where $D_G[\mathfrak{x}]$ is the sum of the Dirichlet integrals of $\mathfrak{x}$, $\frac{1}{2} \iint (\mathfrak{x}_u^2 + \mathfrak{x}_v^2)\, du\, dv$, over the various regions which compose $G$.
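As a small numerical illustration of the Dirichlet integral $\frac{1}{2}\iint (u_r^2 + r^{-2} u_\theta^2)\, r\, dr\, d\theta$ over the unit circle, the sketch below treats a single scalar harmonic component with boundary values $\sum_n (a_n \cos n\theta + b_n \sin n\theta)$. It compares a midpoint-rule quadrature with the closed form $\frac{\pi}{2} \sum_n n (a_n^2 + b_n^2)$, which is a standard fact not stated in the text; the function names and the sample coefficients are assumptions made for the example.

```python
import math


def dirichlet_from_fourier(coeffs):
    """Closed form (pi/2) * sum n (a_n^2 + b_n^2) for the harmonic function
    sum r^n (a_n cos n*theta + b_n sin n*theta) on the unit disk.

    coeffs is a list of (n, a_n, b_n) with n >= 1.
    """
    return 0.5 * math.pi * sum(n * (a * a + b * b) for n, a, b in coeffs)


def dirichlet_numeric(coeffs, nr=200, nt=200):
    """Midpoint-rule value of (1/2) * double integral of |grad u|^2 in polar form."""
    dr = 1.0 / nr
    dt = 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            t = (j + 0.5) * dt
            # Radial derivative u_r and tangential derivative u_theta / r.
            u_r = sum(n * r ** (n - 1) * (a * math.cos(n * t) + b * math.sin(n * t))
                      for n, a, b in coeffs)
            u_t = sum(n * r ** (n - 1) * (-a * math.sin(n * t) + b * math.cos(n * t))
                      for n, a, b in coeffs)
            total += 0.5 * (u_r * u_r + u_t * u_t) * r * dr * dt
    return total


# Hypothetical boundary values: cos(theta) + 0.5 sin(2 theta).
sample = [(1, 1.0, 0.0), (2, 0.0, 0.5)]
exact = dirichlet_from_fourier(sample)
approx = dirichlet_numeric(sample)
```

For a vector-valued potential surface the Dirichlet integral is the sum of such contributions over the components, and for a domain $G$ composed of several regions it is summed over the regions as described above.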

A surface $\mathfrak{x}$ in $\mathfrak{P}$ is determined by the domain $G$ over which it is defined and its boundary values on each of the $k$ circles $C_\mu$. Since the points $p_\mu, q_\mu, r_\mu$ have already been considered part of the domain $G$, the boundary values of $\mathfrak{x}$ will be defined in the following way. Map the circle $C_\mu$ onto a unit circle by a linear transformation or reflection so that $p_\mu, q_\mu, r_\mu$ go into $\theta_1, \theta_2, \theta_3$ respectively. The resulting values of $\mathfrak{x}$ on this circle will be called the boundary values of $\mathfrak{x}$ on $C_\mu$ and denoted by $\mathfrak{x}_\mu(\theta)$. The distance between two elements $\mathfrak{x}$ and $\mathfrak{y}$ of $\mathfrak{P}$ can now be defined by

$$|\mathfrak{x} - \mathfrak{y}| = |G - H| + \sum_{\mu=1}^{k} \max_{0 \le \theta \le 2\pi} |\mathfrak{x}_\mu(\theta) - \mathfrak{y}_\mu(\theta)|,$$

where $G$ and $H$ are the domains of $\mathfrak{x}$ and $\mathfrak{y}$, and $|G - H|$ is the distance between $G$, $H$ in $\mathfrak{R}_k$.
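The distance just defined can be sketched computationally as follows. This is only an illustration under stated assumptions: the domain distance $|G - H|$ is taken as a given number, the boundary values are supplied as hypothetical functions of $\theta$ returning points of space, and the maximum over $\theta$ is approximated on a finite grid.

```python
import math


def surface_distance(domain_distance, boundary_x, boundary_y, samples=720):
    """Approximate |x - y| = |G - H| + sum over mu of max |x_mu(theta) - y_mu(theta)|.

    domain_distance : the number |G - H| (assumed to be computed elsewhere)
    boundary_x, boundary_y : lists of functions theta -> tuple of coordinates,
        one function per boundary circle (hypothetical representation)
    samples : number of grid points used to approximate the maximum
    """
    if len(boundary_x) != len(boundary_y):
        raise ValueError("both surfaces need the same number k of boundaries")
    thetas = [2.0 * math.pi * j / samples for j in range(samples + 1)]
    total = domain_distance
    for x_mu, y_mu in zip(boundary_x, boundary_y):
        total += max(math.dist(x_mu(t), y_mu(t)) for t in thetas)
    return total


# Hypothetical example with k = 1: two parallel unit circles at heights 0 and 0.5.
def circle_low(theta):
    return (math.cos(theta), math.sin(theta), 0.0)


def circle_high(theta):
    return (math.cos(theta), math.sin(theta), 0.5)


d = surface_distance(0.25, [circle_low], [circle_high])
```

In this example the boundary values differ by 0.5 at every $\theta$, so the sampled maximum is exact; in general the grid only approximates the maximum from below.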

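The subspace of all elements $\mathfrak{x}$ of $\mathfrak{P}$ for which $D[\mathfrak{x}] \le N$ will be denoted by $\mathfrak{P}_N$. To avoid confusion with the case of a single boundary $\Gamma$, the latter space will be designated by $\mathfrak{P}^{\Gamma}$; the complete notation for the present space $\mathfrak{P}$ is then $\mathfrak{P}^{\Gamma_1, \dots, \Gamma_k}$.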

It will be convenient to use the following notation. A potential surface in $\mathfrak{P}$ is to be designated by $\mathfrak{x}$ or $\mathfrak{x}(u, v)$, its boundary values in the above sense by $\mathfrak{x}_\mu(\theta)$, and the potential surface defined over the unit circle with the boundary values $\mathfrak{x}_\mu(\theta)$ by $\mathfrak{x}_\mu$. The part of the plane interior or exterior to $C_\mu$, according as $G$ is interior or exterior to $C_\mu$, is denoted by $G_\mu$; the potential surface defined over $G_\mu$, with the same values on $C_\mu$ as $\mathfrak{x}(u, v)$, by $\bar{\mathfrak{x}}_\mu$ (thus $\bar{\mathfrak{x}}_\mu$ is the linear transform of $\mathfrak{x}_\mu$ back to $G_\mu$).

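THEOREM 3. $\mathfrak{P} = \mathfrak{R}_k \times \mathfrak{P}^{\Gamma_1} \times \mathfrak{P}^{\Gamma_2} \times \cdots \times \mathfrak{P}^{\Gamma_k}$, where $\times$ signifies the topologic product.*

PROOF. Let $\mathfrak{x}$ be any potential surface with domain $G$ and boundary values $\mathfrak{x}_\mu(\theta)$. Let $K_\mu$ be a circle in $G$ concentric to $C_\mu$ and near it; let $A_\mu$ be the annular ring between $C_\mu$ and $K_\mu$; and set $G' = G - \sum_\mu A_\mu$. We have

$$D_G[\mathfrak{x}] = \sum_\mu D_{A_\mu}[\mathfrak{x}] + D_{G'}[\mathfrak{x}].$$

Obviously $D_{G'}[\mathfrak{x}] < \infty$, so that $D_G[\mathfrak{x}] < \infty$ if and only if $D_{A_\mu}[\mathfrak{x}] < \infty$ for every $\mu$. In $A_\mu$, set $\mathfrak{x}' = \mathfrak{x} - \bar{\mathfrak{x}}_\mu$. Since $\mathfrak{x}'$ has bounded derivatives near $K_\mu$ and is zero on $C_\mu$, $D_{A_\mu}[\mathfrak{x}'] < \infty$. Hence, by the triangle inequality, $D_{A_\mu}[\mathfrak{x}] < \infty$ if and only if $D_{A_\mu}[\bar{\mathfrak{x}}_\mu] < \infty$; and $D_{A_\mu}[\bar{\mathfrak{x}}_\mu] < \infty$ if and only if $D_{G_\mu}[\bar{\mathfrak{x}}_\mu] < \infty$. The theorem follows by noting that $D_{G_\mu}[\bar{\mathfrak{x}}_\mu] < \infty$ is equivalent to: $\mathfrak{x}_\mu$ lies in $\mathfrak{P}^{\Gamma_\mu}$.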

THEOREM 4. $\mathfrak{P}_N$ is compact and closed for every $N$.

PROOF. Let $\mathfrak{x}^n(u, v)$, with domains $G^n$, be a sequence of surfaces in $\mathfrak{P}_N$. There is a subsequence of the $G^n$'s converging to $G^*$. Transform each domain $G^n$ linearly so that a given $C_\mu$ is the unit circle and the points $p_\mu, q_\mu, r_\mu$ are at the specified places $\theta_1, \theta_2, \theta_3$. As in [1], the boundary values $\mathfrak{x}_\mu^n(\theta)$ are equicontinuous. A final subsequence, written $\mathfrak{x}^n(u, v)$, is obtained such that $\mathfrak{x}^n \to \mathfrak{x}^*$. By referring to the geometric definition of convergence given in §1, it follows that the Dirichlet functional is lower semicontinuous. Hence $D[\mathfrak{x}^*] \le N$, and the theorem is proved.


6. The Continuity of the Functional $E[\mathfrak{x}]$


Let $\mathfrak{x}$ be a potential surface with continuous boundary values $\mathfrak{x}_\mu(\theta)$, and indicate the Dirichlet integral of $\mathfrak{x}_\mu$ taken over the unit circle by $D_0[\mathfrak{x}_\mu]$. Define the functional $E[\mathfrak{x}]$ by

$$E[\mathfrak{x}] = D[\mathfrak{x}] - \sum_{\mu=1}^{k} D_0[\mathfrak{x}_\mu].$$

In this section, we shall prove that $E[\mathfrak{x}]$ is continuous in $\mathfrak{P}$. This will reduce the behavior of $D[\mathfrak{x}]$ to that of $D_0[\mathfrak{x}_\mu]$.

Let $G$ be the domain of the potential surface $\mathfrak{x}$, and $G_\mu$ that region of the plane which is bounded by $C_\mu$ and intersects $G$. Draw a circle $K_\mu$ in $G$ sufficiently near $C_\mu$, $\mu = 1, \dots, k$. Let $G'$ be the subdomain of $G$ bounded by the circles $K_\mu$ and any point circles $c_\nu$, and $G'_\mu$ that subdomain of $G_\mu$ bounded by the circle $K_\mu$. Finally, let $\bar{\mathfrak{x}}_\mu$ be the potential surface defined over $G_\mu$ and having the boundary values of $\mathfrak{x}$ on $C_\mu$; $\mathfrak{x}_\mu$ is the linear transform of $\bar{\mathfrak{x}}_\mu$ to a unit circle.





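* Furthermore, for any number $N$ there is a compact subset $R$ of $\mathfrak{R}_k$ and a number $M$, while for any $R$ and $M$ there is an $N'$, such that $\mathfrak{P}_N \subset R \times \mathfrak{P}_M^{\Gamma_1} \times \cdots \times \mathfrak{P}_M^{\Gamma_k} \subset \mathfrak{P}_{N'}$. This follows from Theorem 5 to be proved below.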










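We have

$$E[\mathfrak{x}] = D[\mathfrak{x}] - \sum_\mu D_{G_\mu}[\bar{\mathfrak{x}}_\mu] = \lim_{K_\mu \to C_\mu} \Big\{ D_{G'}[\mathfrak{x}] - \sum_\mu D_{G'_\mu}[\bar{\mathfrak{x}}_\mu] \Big\};$$

and

$$
\begin{aligned}
D_{G'}[\mathfrak{x}] - \sum_\mu D_{G'_\mu}[\bar{\mathfrak{x}}_\mu]
&= \frac{1}{2} \sum_\mu \int_{K_\mu} \mathfrak{x}\, \frac{\partial \mathfrak{x}}{\partial n}\, ds + \frac{1}{2} \sum_\nu \int_{c_\nu} \mathfrak{x}\, \frac{\partial \mathfrak{x}}{\partial n}\, ds - \frac{1}{2} \sum_\mu \int_{K_\mu} \bar{\mathfrak{x}}_\mu\, \frac{\partial \bar{\mathfrak{x}}_\mu}{\partial n}\, ds \\
&= \frac{1}{2} \sum_\mu \int_{K_\mu} (\mathfrak{x} - \bar{\mathfrak{x}}_\mu)\, \frac{\partial \mathfrak{x}}{\partial n}\, ds + \frac{1}{2} \sum_\mu \int_{K_\mu} \bar{\mathfrak{x}}_\mu\, \frac{\partial (\mathfrak{x} - \bar{\mathfrak{x}}_\mu)}{\partial n}\, ds + \frac{1}{2} \sum_\nu \int_{c_\nu} \mathfrak{x}\, \frac{\partial \mathfrak{x}}{\partial n}\, ds.
\end{aligned}
$$

If $K_\mu \to C_\mu$, the first integral on the right hand side approaches zero since $\mathfrak{x} - \bar{\mathfrak{x}}_\mu = 0$ on $C_\mu$. Hence

$$E[\mathfrak{x}] = \frac{1}{2} \sum_\mu \int_{C_\mu} \bar{\mathfrak{x}}_\mu\, \frac{\partial (\mathfrak{x} - \bar{\mathfrak{x}}_\mu)}{\partial n}\, ds + \frac{1}{2} \sum_\nu \int_{c_\nu} \mathfrak{x}\, \frac{\partial \mathfrak{x}}{\partial n}\, ds. \tag{4}$$

This is an expression for $E[\mathfrak{x}]$ in terms of boundary integrals.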

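THEOREM 5. Let $G^n$ be a sequence of domains in $\mathfrak{R}_k$ converging to $G$, and $\mathfrak{x}^n$ potential surfaces over $G^n$ converging uniformly’ to the potential surface $\mathfrak{x}$ over $G$. Then

$$E[\mathfrak{x}^n] \to E[\mathfrak{x}].$$

PROOF. First consider the case when the $G^n$ are ordinary domains in $\mathfrak{R}_k$. If the limit domain $G$ is also ordinary, the theorem follows from (4). For, $\mathfrak{x}^n \to \mathfrak{x}$ and $\mathfrak{x}^n - \bar{\mathfrak{x}}_\mu^n \to \mathfrak{x} - \bar{\mathfrak{x}}_\mu$, uniformly near $C_\mu$. Since $\mathfrak{x}^n - \bar{\mathfrak{x}}_\mu^n = 0$ on $C_\mu^n$, it can be extended by reflection across $C_\mu^n$, and $\partial(\mathfrak{x}^n - \bar{\mathfrak{x}}_\mu^n)/\partial n$ converges uniformly to $\partial(\mathfrak{x} - \bar{\mathfrak{x}}_\mu)/\partial n$ on $C_\mu$. The second term on the right hand side of (4) is not involved since we have assumed that $G^n$ and $G$ are ordinary domains.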

Suppose that the limit domain $G$ is degenerate. The continuity of $E[\mathfrak{x}]$ will be established by induction on the number $k$ of boundaries. We shall distinguish two cases: a) the origin is enclosed by a large circle or is approached by a large circle; and b) all the circles approaching the origin are small circles.

(a) Enclose this large circle and the set of circles approaching this large circle by a curve $K$ containing no other circles. Let $G_1^n$ be the domain bounded by all the circles outside $K$, and $G_2^n$ the domain bounded by all the circles inside $K$ ($G_1^n$ is a bounded domain, $G_2^n$ unbounded). Let the limits of $G^n$, $G_1^n$, $G_2^n$, considered merely as regions of the plane, be $G^*$, $G_1^*$, $G_2^*$ respectively. Define $\mathfrak{y}^n$ and $\mathfrak{z}^n$ over the domains $G_1^n$ and $G_2^n$ respectively as the bounded potential surfaces with boundary values equal to those of $\mathfrak{x}^n$. Denote the limits of $\mathfrak{x}^n$, $\mathfrak{y}^n$, $\mathfrak{z}^n$ by $\mathfrak{x}^*$, $\mathfrak{y}^*$, $\mathfrak{z}^*$; they are potential surfaces over $G^*$, $G_1^*$, $G_2^*$. We shall first show that $D_{G^n}[\mathfrak{x}^n] - D_{G_1^n}[\mathfrak{y}^n] - D_{G_2^n}[\mathfrak{z}^n] \to D_{G^*}[\mathfrak{x}^*] - D_{G_1^*}[\mathfrak{y}^*] - D_{G_2^*}[\mathfrak{z}^*]$.

Designate the parts of $G^n$ and $G^*$ exterior to $K$ by $A^n$ and $A^*$, and interior




’ This means that the boundary values of $\mathfrak{x}^n$ in the sense of §5, page 207 converge uniformly to the corresponding boundary values of $\mathfrak{x}$.




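to $K$ by $B^n$ and $B^*$. Let $K_{in}$ and $K_{ex}$ be the interior and exterior of $K$. From $G^n = A^n + B^n$, $G_1^n = A^n + K_{in}$, $G_2^n = B^n + K_{ex}$, one obtains

$$D_{G^n}[\mathfrak{x}^n] - D_{G_1^n}[\mathfrak{y}^n] - D_{G_2^n}[\mathfrak{z}^n] = \{D_{A^n}[\mathfrak{x}^n] - D_{A^n}[\mathfrak{y}^n]\} - D_{K_{in}}[\mathfrak{y}^n] + \{D_{B^n}[\mathfrak{x}^n] - D_{B^n}[\mathfrak{z}^n]\} - D_{K_{ex}}[\mathfrak{z}^n];$$

and a similar decomposition for $D_{G^*}[\mathfrak{x}^*] - D_{G_1^*}[\mathfrak{y}^*] - D_{G_2^*}[\mathfrak{z}^*]$. Now,

$$
\begin{aligned}
D_{A^n}[\mathfrak{x}^n] - D_{A^n}[\mathfrak{y}^n] &= D_{A^n}[\mathfrak{x}^n - \mathfrak{y}^n,\ \mathfrak{x}^n + \mathfrak{y}^n] \\
&= \frac{1}{2} \int_K (\mathfrak{x}^n - \mathfrak{y}^n)\, \frac{\partial(\mathfrak{x}^n + \mathfrak{y}^n)}{\partial n}\, ds \to \frac{1}{2} \int_K (\mathfrak{x}^* - \mathfrak{y}^*)\, \frac{\partial(\mathfrak{x}^* + \mathfrak{y}^*)}{\partial n}\, ds \\
&= D_{A^*}[\mathfrak{x}^*] - D_{A^*}[\mathfrak{y}^*];
\end{aligned}
$$

$D_{K_{in}}[\mathfrak{y}^n] \to D_{K_{in}}[\mathfrak{y}^*]$; and like results for $D_{B^n}[\mathfrak{x}^n] - D_{B^n}[\mathfrak{z}^n]$ and $D_{K_{ex}}[\mathfrak{z}^n]$. Hence

$$D_{G^n}[\mathfrak{x}^n] - D_{G_1^n}[\mathfrak{y}^n] - D_{G_2^n}[\mathfrak{z}^n] \to D_{G^*}[\mathfrak{x}^*] - D_{G_1^*}[\mathfrak{y}^*] - D_{G_2^*}[\mathfrak{z}^*]. \tag{5}$$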


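The domains $G_1^n$ and $G_2^n$ have $k_1$ and $k_2$ boundaries respectively, where $k_1 + k_2 = k$, and the theorem is supposed true for domains with less than $k$ boundaries. Let $G^n$, $G_1^n$, $G_2^n$, when considered as domains of $\mathfrak{R}_k$, $\mathfrak{R}_{k_1}$, $\mathfrak{R}_{k_2}$, have $G$, $G_1$, $G_2$ as their limit domains respectively. $G_1$ consists of $G_1^*$ and other regions $H_{(1)}$; $G_2$ consists of $G_2^*$ and other regions $H_{(2)}$; and $G$ consists of $G^*$ and the regions $H_{(1)}$ and $H_{(2)}$. Indicate the boundary curves of $G_1^n$ by $C_{\mu_1}$, and those of $G_2^n$ by $C_{\mu_2}$. By the induction,

$$D_{G_1^n}[\mathfrak{y}^n] - \sum_{\mu_1} D_0[\mathfrak{x}_{\mu_1}^n] \to D_{G_1^*}[\mathfrak{y}^*] + D_{H_{(1)}}[\mathfrak{x}] - \sum_{\mu_1} D_0[\mathfrak{x}_{\mu_1}] \tag{6}$$

and

$$D_{G_2^n}[\mathfrak{z}^n] - \sum_{\mu_2} D_0[\mathfrak{x}_{\mu_2}^n] \to D_{G_2^*}[\mathfrak{z}^*] + D_{H_{(2)}}[\mathfrak{x}] - \sum_{\mu_2} D_0[\mathfrak{x}_{\mu_2}]. \tag{7}$$

Adding (5), (6) and (7), one obtains the desired result

$$
\begin{aligned}
D_{G^n}[\mathfrak{x}^n] - \sum_\mu D_0[\mathfrak{x}_\mu^n] &\to D_{G^*}[\mathfrak{x}^*] + D_{H_{(1)}}[\mathfrak{x}] + D_{H_{(2)}}[\mathfrak{x}] - \sum_\mu D_0[\mathfrak{x}_\mu] \\
&= D[\mathfrak{x}] - \sum_\mu D_0[\mathfrak{x}_\mu].
\end{aligned}
\tag{8}
$$

(b) In this case, when all the circles approaching the origin are small circles, the procedure is the same as the above. The only modification is that the region $G_2^*$ is the whole plane, $\mathfrak{z}^*$ is identically constant, and $\mathfrak{y}^* = \mathfrak{x}^*$. The relation (5) reads $D_{G^n}[\mathfrak{x}^n] - D_{G_1^n}[\mathfrak{y}^n] - D_{G_2^n}[\mathfrak{z}^n] \to 0$, and the final result (8) is the same.

There remains the case when the $G^n$ are themselves degenerate. One may suppose, without loss of generality, that all the $G^n$ have the same type of degeneracy: $G^n$ consists of the distinct regions $G_1^n, G_2^n, \dots$, where the $G_i^n$ have the same boundary and point circles for all $n$. The above proof then applies to each sequence $G_i^n$ separately. The theorem is completely established.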










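Theorem 5 shows that the discontinuities of the functional $D[\mathfrak{x}]$ are due to the boundary values of $\mathfrak{x}$ and not to the domains, and occur in the same manner as each $D_0[\mathfrak{x}_\mu]$. It should be noted that $E[\mathfrak{x}]$ can be so defined, by (4), as to exist even if $D[\mathfrak{x}]$ is infinite. Theorem 5 will still apply.

If $\mathfrak{x}$ and $\mathfrak{y}$ are potential surfaces having the same domain $G$, define the cross $E$-functional $E[\mathfrak{x}, \mathfrak{y}]$ by

$$E[\mathfrak{x}, \mathfrak{y}] = D[\mathfrak{x}, \mathfrak{y}] - \sum_\mu D_0[\mathfrak{x}_\mu, \mathfrak{y}_\mu].$$

The bilinear formula applies:

$$E[\mathfrak{x} + \mathfrak{y}] = E[\mathfrak{x}] + 2E[\mathfrak{x}, \mathfrak{y}] + E[\mathfrak{y}].$$

Consequently Theorem 5 has as a corollary

THEOREM 6. Let $\mathfrak{x}^n$, $\mathfrak{y}^n$ be two potential surfaces both defined over the domain $G^n$ of $\mathfrak{R}_k$, where $n = 1, 2, \dots$. Let the sequences $\mathfrak{x}^n$, $\mathfrak{y}^n$ converge uniformly’ to the potential surfaces $\mathfrak{x}$, $\mathfrak{y}$ defined over the domain $G$. Then

$$E[\mathfrak{x}^n, \mathfrak{y}^n] \to E[\mathfrak{x}, \mathfrak{y}].$$


Part II. APPLICATION TO UNSTABLE MINIMAL SURFACES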


7. Linear Paths of Surfaces 


To obtain the Morse relations for minimal surfaces bounded by $\Gamma_1, \Gamma_2, \dots, \Gamma_k$ it is necessary to establish a variational condition and a reducibility condition. The latter requires discussing special paths of surfaces in $\mathfrak{P}$.

Let us suppose that each $\Gamma_\mu$ is rectifiable, and select the representation $g_\mu(\theta)$, $0 \le \theta \le 2\pi$, of $\Gamma_\mu$ in which the parameter $\theta$ is proportional to the arc length on $\Gamma_\mu$ measured from the point $P_\mu$ and in the direction $P_\mu Q_\mu R_\mu$. Each $g_\mu(\theta)$ has the property

$$\left| \frac{g_\mu(\theta') - g_\mu(\theta'')}{\theta' - \theta''} \right| \le \frac{L}{2\pi}, \tag{9}$$

where $L$ is the largest of the lengths of $\Gamma_1, \dots, \Gamma_k$. Any other representation $\mathfrak{x}_\mu(\theta)$ of $\Gamma_\mu$ is given by

$$\mathfrak{x}_\mu(\theta) = g_\mu(\lambda_\mu(\theta)),$$

where $\lambda_\mu(\theta)$ is a monotonic and continuous function of $\theta$. Call $\lambda_\mu(\theta)$ the monotonic function determined by $\mathfrak{x}_\mu(\theta)$.

Let $\mathfrak{x}_0$ and $\mathfrak{x}_1$ be any two potential surfaces in $\mathfrak{P}$ having the same domain $G$, and with the boundary values (in the sense defined in §5, Part I) $\mathfrak{x}_\mu(\theta; 0)$ and $\mathfrak{x}_\mu(\theta; 1)$ respectively. Define $\mathfrak{x}(t)$, $0 \le t \le 1$, as the potential surface with the same domain $G$ and boundary values $\mathfrak{x}_\mu(\theta; t)$ given by

$$\mathfrak{x}_\mu(\theta; t) = g_\mu[(1 - t)\lambda_\mu(\theta; 0) + t\lambda_\mu(\theta; 1)] \tag{10}$$


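where $\lambda_\mu(\theta; 0)$ and $\lambda_\mu(\theta; 1)$ are the monotonic functions determined by $\mathfrak{x}_0$ and $\mathfrak{x}_1$ respectively:

$$\mathfrak{x}_\mu(\theta; 0) = g_\mu[\lambda_\mu(\theta; 0)], \qquad \mathfrak{x}_\mu(\theta; 1) = g_\mu[\lambda_\mu(\theta; 1)]. \tag{11}$$

The potential surfaces $\mathfrak{x}(t)$, $0 \le t \le 1$, form a "linear" path joining $\mathfrak{x}_0$ and $\mathfrak{x}_1$.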
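LEMMA 3. For the linear path $\mathfrak{x}(t)$ constructed above,

$$\left| \frac{\mathfrak{x}(t') - \mathfrak{x}(t'')}{t' - t''} \right| \le \frac{L}{2\pi} \max_{\mu,\, \theta} |\lambda_\mu(\theta; 1) - \lambda_\mu(\theta; 0)|.$$

PROOF. $\dfrac{\mathfrak{x}(t') - \mathfrak{x}(t'')}{t' - t''}$ has boundary values on $C_\mu$ equal to

$$\frac{g_\mu(\psi') - g_\mu(\psi'')}{t' - t''} = \frac{g_\mu(\psi') - g_\mu(\psi'')}{\psi' - \psi''} \cdot \frac{\psi' - \psi''}{t' - t''},$$

where $\psi' = (1 - t')\lambda_\mu(\theta; 0) + t'\lambda_\mu(\theta; 1)$, and $\psi''$ is a similar expression involving $t''$. Since $\psi' - \psi'' = (t' - t'')[\lambda_\mu(\theta; 1) - \lambda_\mu(\theta; 0)]$, using (9) one obtains

$$\left| \frac{g_\mu(\psi') - g_\mu(\psi'')}{t' - t''} \right| \le \frac{L}{2\pi}\, |\lambda_\mu(\theta; 1) - \lambda_\mu(\theta; 0)|.$$

Similarly for the boundary values of $\dfrac{\mathfrak{x}(t') - \mathfrak{x}(t'')}{t' - t''}$ on any point circle $c_\mu$. The lemma follows.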
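The boundary estimate in the proof of Lemma 3 can be checked numerically in a simple case. The sketch below is illustrative only: it assumes a single boundary curve, the unit circle parametrized by arc length (so that $L/2\pi = 1$), and two hypothetical monotonic functions $\lambda(\theta; 0)$, $\lambda(\theta; 1)$; the three-point normalization is ignored, and only the boundary values, not the potential surfaces themselves, are examined.

```python
import math

L_OVER_2PI = 1.0  # unit circle: length 2*pi, so L / (2*pi) = 1 (assumed example curve)


def g(theta):
    """Arc-length representation of the assumed boundary curve (unit circle)."""
    return (math.cos(theta), math.sin(theta))


def lam0(theta):
    """Hypothetical monotonic function determined by the first surface."""
    return theta


def lam1(theta):
    """Hypothetical monotonic function determined by the second surface
    (derivative 1 + 0.3 cos(theta) is positive, so it is increasing)."""
    return theta + 0.3 * math.sin(theta)


def boundary_path(theta, t):
    """Boundary values of the linear path, equation (10)."""
    return g((1.0 - t) * lam0(theta) + t * lam1(theta))


def difference_quotient(theta, t1, t2):
    """|x(theta; t1) - x(theta; t2)| / |t1 - t2| on the boundary."""
    return math.dist(boundary_path(theta, t1), boundary_path(theta, t2)) / abs(t1 - t2)


def lemma3_bound(theta):
    """Right-hand side (L / 2 pi) |lambda(theta; 1) - lambda(theta; 0)|."""
    return L_OVER_2PI * abs(lam1(theta) - lam0(theta))


# Largest violation of the inequality over a sample grid (should not be positive
# beyond rounding error).
worst = max(
    difference_quotient(th, t1, t2) - lemma3_bound(th)
    for th in [2.0 * math.pi * j / 90 for j in range(91)]
    for t1, t2 in [(0.0, 1.0), (0.2, 0.7), (0.5, 0.51), (1.0, 0.3)]
)
```

The inequality holds here because a chord of the circle is never longer than the arc it subtends, which is the content of property (9) for this particular curve.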

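THEOREM 7. Let $\mathfrak{x}_0^n$ and $\mathfrak{x}_1^n$, both having the domain $G^n$, $n = 1, 2, \dots$, be two sequences of potential surfaces in $\mathfrak{P}$ converging to the same potential surface $\mathfrak{x}$. Let $\mathfrak{x}^n(t)$ be the linear path joining $\mathfrak{x}_0^n$ and $\mathfrak{x}_1^n$. Then

$$\frac{E[\mathfrak{x}^n(t')] - E[\mathfrak{x}^n(t'')]}{t' - t''} \to 0$$

uniformly in $t'$ and $t''$.

PROOF. The desired difference quotient can be written in the form

$$
\begin{aligned}
\frac{E[\mathfrak{x}^n(t')] - E[\mathfrak{x}^n(t'')]}{t' - t''} &= \frac{E[\mathfrak{x}^n(t') - \mathfrak{x}^n(t''),\ \mathfrak{x}^n(t') + \mathfrak{x}^n(t'')]}{t' - t''} \\
&= E\left[ \frac{\mathfrak{x}^n(t') - \mathfrak{x}^n(t'')}{t' - t''},\ \mathfrak{x}^n(t') + \mathfrak{x}^n(t'') \right].
\end{aligned}
\tag{12}
$$

Now the monotonic functions $\lambda_\mu^n(\theta; 0)$ and $\lambda_\mu^n(\theta; 1)$ determined by $\mathfrak{x}_0^n$ and $\mathfrak{x}_1^n$ both converge uniformly to the monotonic function determined by $\mathfrak{x}$. Consequently $\mathfrak{x}^n(t)$ converges to $\mathfrak{x}$, uniformly in $t$; and by Lemma 3, $\dfrac{\mathfrak{x}^n(t') - \mathfrak{x}^n(t'')}{t' - t''}$ converges to the potential surface identically equal to zero, uniformly in $t'$ and $t''$. Theorem 6 of Part I applied to the last expression in (12) shows that the desired difference quotient converges to $E[0, 2\mathfrak{x}] = 0$, uniformly in $t'$ and $t''$.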










8. The Reducibility Condition 


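We shall obtain reducibility on the assumption that reducibility holds for the case of single boundaries. Such results for single boundaries have been derived in rather general cases by the author, and by Morse and Tompkins.

Suppose that the following result has been established for each of the curves $\Gamma_\mu$, $\mu = 1, 2, \dots, k$.

Reducibility Condition for Single Boundaries.* Let $\mathfrak{y}$ be any surface in $\mathfrak{P}^{\Gamma}$. To each surface $\mathfrak{x}$ of $\mathfrak{P}^{\Gamma}$ there can be associated a linear path $\mathfrak{x}(t)$ joining $\mathfrak{x}$ to $\mathfrak{x}' = \mathfrak{x}(1)$, where $\mathfrak{x}'$ depends continuously on $\mathfrak{x}$, and $\mathfrak{x}' = \mathfrak{y}$ if $\mathfrak{x} = \mathfrak{y}$. For any $\eta$** and $\gamma$ there is an $\alpha$ independent of $\gamma$ and a $\delta$ such that, if $\mathfrak{x}$ is in the $\delta$-neighborhood of $\mathfrak{y}$, this linear path $\mathfrak{x}(t)$ has the following properties:

1). $D[\mathfrak{x}(t)]$ exists and depends continuously on $t$ in $0 \le t \le 1$.

2). $D[\mathfrak{x}(t')] > D[\mathfrak{y}] + \eta$ and $D[\mathfrak{x}(t'')] > D[\mathfrak{y}] + \eta$ imply that

$$\frac{\Delta D}{\Delta t} = \frac{D[\mathfrak{x}(t')] - D[\mathfrak{x}(t'')]}{t' - t''} \le -\alpha.$$

3). $\dfrac{\Delta D}{\Delta t} < \gamma$ for any $t'$, $t''$.

In case $\dfrac{dD[\mathfrak{x}(t)]}{dt}$ exists for $0 \le t \le 1$, Properties 2) and 3) are equivalent to:

2'). $D[\mathfrak{x}(t)] > D[\mathfrak{y}] + \eta$ implies that $\dfrac{dD[\mathfrak{x}(t)]}{dt} \le -\alpha$.

3'). $\dfrac{dD[\mathfrak{x}(t)]}{dt} \le \gamma$ for $0 \le t \le 1$.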


There is an important consequence of Property 2).

LEMMA 4. If $D[\mathfrak{x}(t')] \le D[\mathfrak{y}] + \eta$ for some $t'$, then $D[\mathfrak{x}(t)] \le D[\mathfrak{y}] + \eta$ for all $t \ge t'$. If $D[\mathfrak{x}(t'')] > D[\mathfrak{y}] + \eta$ for some $t''$, then $D[\mathfrak{x}(t)] > D[\mathfrak{y}] + \eta$ for all $t \le t''$.

PROOF. Let the maximum of $D[\mathfrak{x}(t)]$ for $t \ge t'$ be attained for $t = \tau$. If $D[\mathfrak{x}(\tau)] > D[\mathfrak{y}] + \eta$, choose $\tau'$ between $t'$ and $\tau$ and so near $\tau$ that $D[\mathfrak{x}(\tau')] > D[\mathfrak{y}] + \eta$. Then

$$\frac{D[\mathfrak{x}(\tau)] - D[\mathfrak{x}(\tau')]}{\tau - \tau'} \ge 0,$$

contrary to Property 2). Hence $D[\mathfrak{x}(\tau)] \le D[\mathfrak{y}] + \eta$, and the first statement of the lemma is proved. The second statement of the lemma is a consequence of the first statement.

Therefore, the possibilities for the graph of $D[\mathfrak{x}(t)]$ as a function of $t$ in $0 \le t \le 1$ are:

a) If $D[\mathfrak{x}] \le D[\mathfrak{y}] + \eta$, then $D[\mathfrak{x}(t)]$ is always at most $D[\mathfrak{y}] + \eta$.

b) If $D[\mathfrak{x}] > D[\mathfrak{y}] + \eta$, then $D[\mathfrak{x}(t)]$ changes at a rate $\le -\alpha$ until $D[\mathfrak{y}] + \eta$ is reached (if it is reached at all) and thereafter $D[\mathfrak{x}(t)]$ remains below $D[\mathfrak{y}] + \eta$.

This is exactly what is meant by 'reducibility'. Compare page 36 of [6], Lemma 9 and Theorem 4 of [12], and page 445 of [7].

We now return to the case of several boundaries.








* The reducibility condition stated here can be easily generalized but it is useless to do so.
** All the quantities $\eta$, $\gamma$, $\alpha$, $\delta$ are positive.







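THEOREM 8. Suppose that the reducibility condition above has been established for the individual rectifiable curves $\Gamma_1, \Gamma_2, \dots, \Gamma_k$. Then the same condition holds for the case of the $k$ boundaries $\Gamma_1, \Gamma_2, \dots, \Gamma_k$ together, i.e., in the space $\mathfrak{P}^{\Gamma_1, \Gamma_2, \dots, \Gamma_k}$.

PROOF. Let $\mathfrak{y}$ be a fixed surface in $\mathfrak{P} = \mathfrak{P}^{\Gamma_1, \Gamma_2, \dots, \Gamma_k}$ with boundary values $\mathfrak{y}_\mu$, and let $\mathfrak{x}$ with domain $G$ and boundary values $\mathfrak{x}_\mu$ be any surface in $\mathfrak{P}$. To the potential surface $\mathfrak{x}_\mu$ with boundary $\Gamma_\mu$ there is associated the linear path $\mathfrak{x}_\mu(t)$ determined by our hypothesis for $\Gamma_\mu$. The required linear path associated to $\mathfrak{x}$ is defined over the domain $G$ and has the boundary values $\mathfrak{x}_\mu(t)$, $\mu = 1, \dots, k$. We have

$$D[\mathfrak{x}(t)] = \sum_{\mu=1}^{k} D_0[\mathfrak{x}_\mu(t)] + E[\mathfrak{x}(t)], \tag{13}$$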


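which shows that $D[\mathfrak{x}(t)]$ is a continuous function of $t$ because of Property 1) of our hypothesis and of Theorem 5 in Part I.

Let $\eta$ be arbitrary. Our hypothesis for $\Gamma_\mu$ determines a constant $\alpha_\mu$ belonging to $\eta/2k$ in place of $\eta$. Set $\alpha = \min \alpha_\mu$ for $\mu = 1, \dots, k$. Choose any $\gamma \le \alpha/2k$. There is a $\delta_\mu$ such that all the assertions of our hypothesis (with $\eta/2k$ replacing $\eta$) apply if $\mathfrak{x}_\mu$ is in the $\delta_\mu$-neighborhood of $\mathfrak{y}_\mu$. Set $\delta' = \min \delta_\mu$ for $\mu = 1, \dots, k$.

If $\mathfrak{x}$ is close to $\mathfrak{y}$ then $\mathfrak{x}(t)$ is likewise close to $\mathfrak{y}$ (since $\mathfrak{x}_\mu(t)$ is close to $\mathfrak{y}_\mu$ for each $\mu$). By Theorem 5 in Part I, there is a $\delta''$ such that

$$|E[\mathfrak{x}(t)] - E[\mathfrak{y}]| < \eta/2 \tag{14}$$

whenever $\mathfrak{x}$ is in the $\delta''$-neighborhood of $\mathfrak{y}$. By Theorem 7, there is a $\delta'''$ such that

(15)  [Display garbled in this copy: a bound on the difference quotient $(E[\mathfrak{x}(t')] - E[\mathfrak{x}(t'')])/(t' - t'')$.]

if $\mathfrak{x}$ is in the $\delta'''$-neighborhood of $\mathfrak{y}$.

The required $\delta$ is defined by $\delta = \min(\delta', \delta'', \delta''')$. Consider only surfaces $\mathfrak{x}$ which are in the $\delta$-neighborhood of $\mathfrak{y}$.

Suppose that $D[\mathfrak{x}(t')] > D[\mathfrak{y}] + \eta$ and $D[\mathfrak{x}(t'')] > D[\mathfrak{y}] + \eta$, where $t'' < t'$. The first inequality requires that $D_0[\mathfrak{x}_\nu(t')] > D_0[\mathfrak{y}_\nu] + \eta/2k$ for some $\mu = \nu$, by (13) and (14). Lemma 4 yields $D_0[\mathfrak{x}_\nu(t'')] > D_0[\mathfrak{y}_\nu] + \eta/2k$. Properties 2) and 3) of our hypotheses give

$$\frac{\Delta D_0[\mathfrak{x}_\nu]}{\Delta t} = \frac{D_0[\mathfrak{x}_\nu(t')] - D_0[\mathfrak{x}_\nu(t'')]}{t' - t''} \le -\alpha \tag{16}$$

and

$$\frac{\Delta D_0[\mathfrak{x}_\mu]}{\Delta t} < \gamma \le \alpha/2k \quad \text{for all } \mu. \tag{17}$$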




























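A final calculation, using (15), (16), (17), shows that

$$\frac{\Delta D[\mathfrak{x}]}{\Delta t} = \frac{D[\mathfrak{x}(t')] - D[\mathfrak{x}(t'')]}{t' - t''} = \sum_\mu \frac{\Delta D_0[\mathfrak{x}_\mu]}{\Delta t} + \frac{\Delta E[\mathfrak{x}]}{\Delta t}$$

is bounded above by a negative constant. [The explicit bound is garbled in this copy.] This proves Property 2).

Property 3) likewise holds, by (15) and (17). But this is of no importance in connection with reducibility. q.e.d.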

For single boundaries reducibility has been proved by the author, by Morse 
and Tompkins, and by Courant for special classes of boundary curves. We 
shall discuss each briefly. 

The author in [12] used the following type of path joining two boundary representations $\mathfrak{x}_0(\theta)$ and $\mathfrak{x}_1(\theta)$ of the curve $\Gamma$. Let $\mathfrak{x}_0(\lambda_0(\theta)) = g(\theta)$ and $\mathfrak{x}_1(\lambda_1(\theta)) = g(\theta)$, where $\lambda_0(\theta)$ and $\lambda_1(\theta)$ are monotonic functions of $\theta$ and $g(\theta)$ is that representation of $\Gamma$ the parameter $\theta$ of which is proportional to the arc length. Define $\mathfrak{x}(\psi; t)$ by

$$\mathfrak{x}(\psi; t) = g(\theta) \quad \text{where} \quad \psi = (1 - t)\lambda_0(\theta) + t\lambda_1(\theta).$$

Compare with the linear path (10), (11) used in the present paper. For this path a property such as Theorem 7 is very difficult to prove. Accordingly, the results of [12] cannot be used here.

Morse and Tompkins in [7] proved the reducibility condition stated above in case $\Gamma$ has the following property:

$$g'(\theta) \text{ exists and } |g'(\theta_1) - g'(\theta_2)| < M|\theta_1 - \theta_2| \tag{18}$$

for all $\theta_1$, $\theta_2$. See Lemma 5.1, Lemma 5.4 and Theorem 5.1 in [7]. The hypothesis of Theorem 8 is consequently satisfied if each $\Gamma_\mu$, $\mu = 1, \dots, k$, has this property.

The above reducibility condition is implicitly contained in the work of Courant 
[3] for polygonal boundaries. We shall not stress this point however, since 
Courant’s method is directly applicable without the intervention of §§7,8. We 
consider this matter in the next section. 


9. The Case of Polygonal Boundaries 


We shall show how to apply the method of Courant [3] to the case of several 
polygonal boundaries, using only the results of Part I. 

Let the vertices of $\Gamma_\mu$ be $P_\mu, Q_\mu, R_\mu, S_\mu, T_\mu, \dots$ etc. For a given surface $\mathfrak{x}(u, v)$ of $\mathfrak{P}$ denote points on $C_\mu$ which are mapped into $P_\mu, Q_\mu, R_\mu, S_\mu, \dots$ by $p_\mu, q_\mu, r_\mu, s_\mu, \dots$. Map the circle $C_\mu$ by a linear transformation into a unit circle so that the images of $p_\mu, q_\mu, r_\mu$ are the specified points $\theta_1, \theta_2, \theta_3$. The images of $s_\mu, t_\mu, \dots$ on the unit circle will be denoted collectively by $\sigma_\mu$; and $\sigma_1, \sigma_2, \dots, \sigma_k$ will be indicated collectively by $\sigma$.







Let $\Omega$ denote the space of elements $\{G, \sigma\}$, of domains $G$ and points $\sigma$, with an obvious metric. The dimension of $\Omega$ is $3k - 6 + V$ where $V$ is the total number of vertices in the polygons $\Gamma_1, \dots, \Gamma_k$. $\Omega$ has the same connectivity numbers as $\mathfrak{R}_k$.

The space of potential surfaces $\mathfrak{x}(u, v)$ of $\mathfrak{P}$ defined over elements $\{G, \sigma\}$ of $\Omega$ will be indicated by $\mathfrak{P}'$. $\mathfrak{P}'$ has the same connectivity numbers as $\mathfrak{P}$.

For a given $\{G, \sigma\}$ of $\Omega$ consider the problem of minimizing $D[\mathfrak{x}]$ among all surfaces $\mathfrak{x}$ of $\mathfrak{P}'$ defined over this $\{G, \sigma\}$. By the compactness of $\mathfrak{P}_N$ (and of $\mathfrak{P}'_N$), this problem has a solution $\mathfrak{x}$. As in [8], using the linear path $\mathfrak{x}(t) = (1 - t)\mathfrak{x}_0 + t\mathfrak{x}_1$ joining two surfaces $\mathfrak{x}_0$ and $\mathfrak{x}_1$ having the same $\{G, \sigma\}$, the solution is unique. Denote it by $\mathfrak{x}(G, \sigma)$, and set $d(G, \sigma) = D[\mathfrak{x}(G, \sigma)]$.

THEOREM 9. The surface $\mathfrak{x}(G, \sigma)$ and the quantity $d(G, \sigma)$ depend continuously on $\{G, \sigma\}$.

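PROOF. Let $\{G^n, \sigma^n\} \to \{G, \sigma\}$. Abbreviate $\mathfrak{x}(G, \sigma)$ by $\mathfrak{x}$, and let $\mathfrak{x}_\mu$ be the potential surface defined over a unit circle with the boundary values determined by $\mathfrak{x}$ on $C_\mu$. Vary the points $\sigma_\mu$ and the surface $\mathfrak{x}_\mu$ as in [3] so that the $\sigma_\mu$ move into $\sigma_\mu^n$. The varied potential surface $\mathfrak{y}_\mu^n$ is such that

$$\mathfrak{y}_\mu^n \to \mathfrak{x}_\mu \quad \text{and} \quad D_0[\mathfrak{y}_\mu^n] \to D_0[\mathfrak{x}_\mu] \quad \text{as } n \to \infty. \tag{19}$$

Let $\mathfrak{y}^n$ be the potential surface defined over $\{G^n, \sigma^n\}$ with boundary values on $C_\mu$ determined by $\mathfrak{y}_\mu^n$, $\mu = 1, \dots, k$. We have

$$D[\mathfrak{y}^n] = \sum_\mu D_0[\mathfrak{y}_\mu^n] + E[\mathfrak{y}^n] \to \sum_\mu D_0[\mathfrak{x}_\mu] + E[\mathfrak{x}] = D[\mathfrak{x}] \tag{20}$$

by (19) and the continuity of the functional $E[\mathfrak{x}]$ (Theorem 5 of Part I). Because $D[\mathfrak{y}^n] \ge d(G^n, \sigma^n)$, (20) yields

$$d(G, \sigma) \ge \limsup\, d(G^n, \sigma^n). \tag{21}$$

In particular the quantities $d(G^n, \sigma^n)$ are uniformly bounded.

On the other hand, by the compactness of $\mathfrak{P}_N$, a subsequence of the surfaces $\mathfrak{x}(G^n, \sigma^n)$ converges to a potential surface $\mathfrak{x}^*$ defined over $\{G, \sigma\}$. The lower semicontinuity of the Dirichlet functional gives

$$d(G, \sigma) \le D[\mathfrak{x}^*] \le \liminf\, d(G^n, \sigma^n). \tag{22}$$

The relations (21), (22) establish the continuity of $d(G, \sigma)$.

The equality sign holds throughout (22), so that $D[\mathfrak{x}^*] = d(G, \sigma)$ or $\mathfrak{x}^* = \mathfrak{x}(G, \sigma)$. Thus $\mathfrak{x}(G^n, \sigma^n) \to \mathfrak{x}(G, \sigma)$, and the theorem is proved.

This theorem shows first that the set of surfaces $\mathfrak{x}(G, \sigma)$ is topologically equivalent to the space $\Omega$. The symbol $\Omega$ will henceforth designate the space of these surfaces $\mathfrak{x}(G, \sigma)$. The theorem then asserts that $D[\mathfrak{x}]$ is continuous over $\Omega$.

So far as Morse theory is concerned, our discussion may be limited to the space $\Omega$.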










10. The Variational Condition 


In this section, no restriction will be made on the contours $\Gamma_1, \dots, \Gamma_k$.

A surface $\mathfrak{x}$ in $\mathfrak{P}$ is a minimal surface if the analytic function $\varphi(w) = (\mathfrak{x}_u - i\mathfrak{x}_v)^2 = \mathfrak{x}_u^2 - \mathfrak{x}_v^2 - 2i\,\mathfrak{x}_u\mathfrak{x}_v$ is identically zero in each of the regions over which $\mathfrak{x}$ is defined. The purpose of this section is to show, when $\mathfrak{y}$ is not a minimal surface, that a neighborhood of $\mathfrak{y}$ can be deformed so as to decrease the value of the Dirichlet functional. If $\mathfrak{y}$ is an ordinary surface in $\mathfrak{P}$, i.e., not degenerate, then this result is an immediate consequence of variations performed by Courant. The difficulty arises when $\mathfrak{y}$ is degenerate; for this case a more detailed investigation is necessary.

Let $\mathfrak{y}$, with domain $G$, be a surface in $\mathfrak{P}$ that is not a minimal surface. We may suppose without loss of generality that $-2\mathfrak{y}_u\mathfrak{y}_v$ is not identically zero in at least one of the regions $G'$ which compose $G$.¹⁰ Let $a$ be any value $\ge D[\mathfrak{y}]$. Let $\mathfrak{x}$, with domain $H$, indicate a surface in $\mathfrak{P}_a$ near $\mathfrak{y}$, and let $H'$ be that region of $H$ corresponding to $G'$ and normalized analogously to $G'$. The following lemma is a result of performing certain variations due to Courant (see [2]):

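Lemma 5. Let $\mathfrak{x}$, with domain $H$, be a surface in $\mathfrak{P}_a$, and suppose that $-2\mathfrak{x}_u\mathfrak{x}_v \le -b < 0$ in a fixed square $K$ (with sides parallel to the $u$ and $v$ axes) of side $2\tau$ interior to the region $H'$ of $H$. Then a deformation $\mathfrak{x}(\epsilon)$, $0 \le \epsilon \le 1$, of $\mathfrak{x}$ can be obtained such that

$$D[\mathfrak{x}(\epsilon)] \le D[\mathfrak{x}] - \beta\epsilon,$$

where $\beta$ depends on $a$, $b$, and $\tau$. This deformation $\mathfrak{x}(\epsilon)$ is a continuous function of $\epsilon$, and of $\mathfrak{x}$ as long as no boundaries of $H'$ appear in the square $K$.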
Proof. Deform the region $H'$ into $H'(\epsilon)$ by the transformation

$$U = u + \epsilon\lambda(u, v), \qquad V = v, \tag{23}$$

where

$$\lambda(u, v) = \begin{cases} 0 & \text{outside the square } K, \\ \left[\tau^2 - (u - u_0)^2\right]\left[1 - \dfrac{v - v_0}{2\tau}\right] & \text{inside } K. \end{cases} \tag{24}$$

Here, $(u_0, v_0)$ is the center of the lower side $L$ of the square $K$ and $0 \le \epsilon \le 1/(4\tau)$. The function $\lambda(u, v)$ is non-negative, is discontinuous across the side $L$, and has its first derivatives bounded in absolute value by $2\tau$. By coordinating the point $u$ of the lower edge of $L$ with the point $u + \epsilon\lambda(u, v_0)$ of the upper edge of $L$, the transformed domain becomes a Riemann domain. Define $\mathfrak{x}(\epsilon)$ over the Riemann domain $H'(\epsilon)$ as the potential surface having the same boundary





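10 If $-2\mathfrak{y}_u\mathfrak{y}_v \equiv 0$ but $\mathfrak{y}_u^2 - \mathfrak{y}_v^2 \not\equiv 0$, rotate the $(u, v)$ coordinate system 45° and consider the new coordinates $u'$, $v'$. We have $u' = (u + v)/\sqrt{2}$, $v' = (u - v)/\sqrt{2}$ and $\mathfrak{y}_u^2 - \mathfrak{y}_v^2 = 2\,\mathfrak{y}_{u'}\mathfrak{y}_{v'}$.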







218 MAX SHIFFMAN 


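values as $\mathfrak{x}$ on each boundary. One may map $H'(\epsilon)$ conformally on a circular domain normalized analogously to $H'$, and consider $\mathfrak{x}(\epsilon)$ as a potential surface over this circular domain.¹¹ In all the other regions besides $H'$ which compose $H$, set $\mathfrak{x}(\epsilon) = \mathfrak{x}$. It can be proved by use of conformal mapping methods that $\mathfrak{x}(\epsilon)$ depends continuously on $\epsilon$, and on $\mathfrak{x}$ as long as no boundaries of $H'$ appear inside the square $K$.

To compute $D[\mathfrak{x}(\epsilon)]$, introduce the surface $\mathfrak{z}(\epsilon)$ defined over the region $H'(\epsilon)$ by $\mathfrak{z}(U, V; \epsilon) = \mathfrak{x}(u, v)$ where $U$, $V$ are given in (23). Then $\mathfrak{x}(\epsilon)$ has the same boundary values as $\mathfrak{z}(\epsilon)$ in $H'(\epsilon)$, and we have by a simple calculation

$$D[\mathfrak{x}(\epsilon)] \le D[\mathfrak{z}(\epsilon)] = D[\mathfrak{x}] + \frac{\epsilon}{2}\int_{u_0-\tau}^{u_0+\tau} \lambda(u, v_0)\,(-2\mathfrak{x}_u\mathfrak{x}_v)\,du + \epsilon^2 J,$$

where $|J| < 4\tau^2 D[\mathfrak{x}] \le 4\tau^2 a$. Since $-2\mathfrak{x}_u\mathfrak{x}_v \le -b$ in the square $K$,

$$\int_{u_0-\tau}^{u_0+\tau} \lambda(u, v_0)\,(-2\mathfrak{x}_u\mathfrak{x}_v)\,du \le -b\int_{u_0-\tau}^{u_0+\tau} \left[\tau^2 - (u - u_0)^2\right] du = -\frac{4\tau^3 b}{3}.$$

For $0 \le \epsilon \le \sigma'$, where $\sigma'$ is the smaller of the two numbers $1/(4\tau)$, $\tau b/(12a)$, we obtain

$$\frac{1}{2}\int_{u_0-\tau}^{u_0+\tau} \lambda(u, v_0)\,(-2\mathfrak{x}_u\mathfrak{x}_v)\,du + \epsilon J \le -\frac{1}{2}\cdot\frac{4\tau^3 b}{3} + \frac{\tau b}{12a}\cdot 4\tau^2 a = -\frac{\tau^3 b}{3}.$$

Hence, in $0 \le \epsilon \le \sigma'$, $D[\mathfrak{x}(\epsilon)] \le D[\mathfrak{x}] - \tfrac{1}{3}\tau^3 b\,\epsilon$. Replacing $\epsilon$ by $\sigma'\epsilon$, the lemma is obtained with $\beta = \tfrac{1}{3}\tau^3 b\,\sigma'$.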
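As a numerical sanity check on the elementary integral used in this estimate, the following sketch (Python, standard library only) approximates the integral of $\tau^2 - (u - u_0)^2$ over the lower side of the square and compares it with $4\tau^3/3$. The function name `profile_integral`, the sample values of $\tau$ and $u_0$, and the use of the midpoint rule are illustrative assumptions and play no part in the argument.

```python
def profile_integral(tau, u0=0.0, n=100_000):
    """Midpoint-rule approximation of the integral of tau^2 - (u - u0)^2
    over the interval [u0 - tau, u0 + tau]."""
    h = 2 * tau / n                       # width of each subinterval
    total = 0.0
    for i in range(n):
        u = u0 - tau + (i + 0.5) * h      # midpoint of the i-th subinterval
        total += (tau ** 2 - (u - u0) ** 2) * h
    return total


if __name__ == "__main__":
    tau = 0.5                             # hypothetical half-side of the square K
    print(profile_integral(tau), 4 * tau ** 3 / 3)
```

The two printed numbers should agree to many decimal places; multiplying by $-b$ gives the bound on the first-order term.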

Lemma 5 will now be used to perform a piecewise deformation of a neighborhood of $\mathfrak{y}$ in $\mathfrak{P}_a$. There is an open region in $G'$ in which $-2\mathfrak{y}_u\mathfrak{y}_v \le -2b < 0$. If $\mathfrak{y}$ is ordinary, a square $K$ can be selected in this open region, and Lemma 5 immediately yields the required deformation. If $\mathfrak{y}$ is degenerate, Lemma 5 does not apply since a surface near $\mathfrak{y}$ may have small circular boundaries in $K$; but it must have less than $k$ small boundaries. Accordingly, in the region where $-2\mathfrak{y}_u\mathfrak{y}_v \le -2b$, select $k$ squares $K_1, K_2, \dots, K_k$ of sufficiently small side $2\tau$ with sides parallel to the $u$ and $v$ axes, and at a distance $\ge 16\tau$ from each other and from the boundaries of $G'$. Consider the $2\delta$-neighborhood of $\mathfrak{y}$ such that, for any $\mathfrak{x}$ (with domain $H$) in this neighborhood:

1) those boundaries of $H'$ which correspond to the boundaries of $G'$ are displaced from them by at most a distance $\tau$, and the other boundaries of $H'$ (these will hereafter be called small boundaries) have a radius $< \tau$;

2) if there are no boundary points of $H'$ within a $2\tau$ neighborhood of the center of the square $K_i$, then $-2\mathfrak{x}_u\mathfrak{x}_v < -b$ throughout this $K_i$. The desired





11 Compare [2]. The end points of $L$ are actually transformed into points (and not merely slits). Of course it is possible to obtain a similar lemma and the variational condition by using variations not depending on the theory of conformal mapping, as in [1], [2]. But it would then be necessary to distinguish two cases: varying boundary values and varying the domain.















neighborhood of $\mathfrak{y}$ in $\mathfrak{P}_a$ which will be deformed is the closed $\delta$-neighborhood $M_1 = N_\delta$ in $\mathfrak{P}_a$.

The piecewise deformation of $M_1$ will be obtained by applying Lemma 5 successively for each square $K_i$. For this purpose, limit $\epsilon$ to the range $0 \le \epsilon \le \sigma$ where $\sigma$ is so small that:

3) the surface $\mathfrak{x}(\epsilon)$ of Lemma 5 for each square $K_i$ is displaced from $\mathfrak{x}$ at most a distance $\delta/k$;

4) the boundaries of $H'(\epsilon)$ are displaced from the corresponding boundaries of $H'$ at most a distance $\tau/k$.

It follows from these conditions that any surface $\mathfrak{z}$ obtained by a $k$-fold repeated application of all these deformations satisfies 3), 4) with $\delta$, $\tau$ replacing $\delta/k$, $\tau/k$. Because of this, conditions 1), 2) likewise apply to $\mathfrak{z}$. Replace $\epsilon$ by $\sigma\epsilon$, and $\sigma\beta$ by $\beta$, so that Lemma 5 and the above apply for $0 \le \epsilon \le 1$.

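Let $A_1$ be that subset of $M_1$ consisting of those surfaces $\mathfrak{x}$ interior to $M_1$ whose region $H'$ contains boundaries at a distance $< 3\tau$ from the center of $K_1$. Indicate the closure of $A_1$ by $\bar{A}_1$, and set $B_1 = M_1 - \bar{A}_1$. If $\mathfrak{x}$ is any surface in $\bar{A}_1$, it remains fixed in the deformation; if $\mathfrak{x}$ is in $B_1$, construct the deformation $\mathfrak{x}(\epsilon)$ of Lemma 5 for the square $K_1$. Indicate the boundary operator by $\mathfrak{B}_1$ (boundary as subset of $\mathfrak{P}_a$), the deformation operator by $\mathfrak{D}_1$, and the final image $\mathfrak{x}(1)$ by $\mathfrak{F}_1$. Then

$$\mathfrak{D}_1 B_1 \to B_1 - \mathfrak{D}_1\mathfrak{B}_1 B_1 - \mathfrak{F}_1 B_1 \quad\text{or}\quad B_1 \sim \mathfrak{D}_1\mathfrak{B}_1 B_1 + \mathfrak{F}_1 B_1. \tag{25}$$

These relations and the operators $\mathfrak{B}_1$, $\mathfrak{D}_1$, $\mathfrak{F}_1$ are understood in the sense that they apply in a natural way to any Vietoris chain on $B_1$. In view of Lemma 5 all the surfaces involved in (25) lie on $\mathfrak{P}_a$, and $\mathfrak{F}_1 B_1$ lies on $\mathfrak{P}_{a-\beta}$; the homology, and all subsequent homologies, take place over $\mathfrak{P}_a$. It follows from (25) that¹²

$$M_1 \sim \bar{A}_1 + \mathfrak{D}_1\mathfrak{B}_1 B_1 + \mathfrak{F}_1 B_1 = M_2, \qquad \mathfrak{P}_a \sim (\mathfrak{P}_a - M_1) + M_2 = P_2. \tag{26}$$

The surfaces in $M_2$ may be designated by $f(\mathfrak{x}, t)$ in place of $\mathfrak{x}(\epsilon)$, where $0 \le t \le 1$ if $\mathfrak{x}$ belongs to $\mathfrak{B}_1 B_1$, $t = 0$ if $\mathfrak{x}$ belongs to $\bar{A}_1$, and $t = 1$ if $\mathfrak{x}$ is in the interior of $B_1$. The surfaces in $P_2$ consist of $M_2$ and $f(\mathfrak{x}, 0)$ where $\mathfrak{x}$ belongs to $\mathfrak{P}_a - M_1$. The surface $\mathfrak{x}$, from which $f(\mathfrak{x}, t)$ was obtained, is called the pre-image of the surface $f(\mathfrak{x}, t)$. The set $P_2$ may be considered as a new space in which distance between $f(\mathfrak{x}_1, t_1)$ and $f(\mathfrak{x}_2, t_2)$ is defined by $|\mathfrak{x}_1 - \mathfrak{x}_2| + |t_1 - t_2|$, where $|\mathfrak{x}_1 - \mathfrak{x}_2|$ is the distance between $\mathfrak{x}_1$, $\mathfrak{x}_2$ in $\mathfrak{P}_a$.

Suppose that $M_j$ and $P_j$ ($j \le k$) have been constructed, and consist of surfaces in $\mathfrak{P}_a$ designated by $f(\mathfrak{x}, t_1, \dots, t_{j-1})$ where $0 \le t_\nu \le 1$ or $t_\nu = 0$ or $t_\nu = 1$ according to the situation of $\mathfrak{x}$ in certain subsets of $\mathfrak{P}_a$. Let $A_j$ be the subset

12 Any Vietoris chain in $M_1$ is homologous by subdivision to a Vietoris chain part of which lies wholly in $\bar{A}_1$ and the other part wholly in $B_1$. The homology $M_1 \sim M_2$ is a consequence. Similarly for $\mathfrak{P}_a \sim P_2$.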









of $M_j$ consisting of all the surfaces $f(\mathfrak{x}, t_1, \dots, t_{j-1})$ in $M_j$ where the region $H'$ for $\mathfrak{x}$ contains boundaries at a distance $< 3\tau$ from the center of the square $K_j$; set $B_j = M_j - \bar{A}_j$. If $f(\mathfrak{x}, t_1, \dots, t_{j-1})$ is any surface in $\bar{A}_j$, it remains fixed. If $f(\mathfrak{x}, t_1, \dots, t_{j-1})$ is in $B_j$, deform $f(\mathfrak{x}, t_1, \dots, t_{j-1})$, considered as a surface in $\mathfrak{P}_a$, according to Lemma 5 for the square $K_j$; Lemma 5 applies in view of conditions 3), 4) and 1), 2) above. Indicate the deformed surface by $f(\mathfrak{x}, t_1, \dots, t_{j-1}, t_j)$ where $0 \le t_j \le 1$. As in (25), (26),

$$M_j \sim \bar{A}_j + \mathfrak{D}_j\mathfrak{B}_j B_j + \mathfrak{F}_j B_j = M_{j+1}, \qquad P_j \sim (\mathfrak{P}_a - M_1) + M_{j+1} = P_{j+1}, \tag{27}$$

where closure $\bar{A}_j$ or $\bar{B}_j$, boundary $\mathfrak{B}_j$ and image $\mathfrak{F}_j$ are taken as subsets of $P_j$. All the relations (27) for $j = 1, 2, \dots, k$ yield

$$M_1 \sim M_{k+1}, \qquad \mathfrak{P}_a \sim P_{k+1}. \tag{28}$$

It is clear from Lemma 5 that if $\mathfrak{z} = f(\mathfrak{x}, t_1, t_2, \dots, t_k)$ is any surface in $M_{k+1}$, then

$$D[\mathfrak{z}] \le D[\mathfrak{x}] - \beta \sum_{\nu=1}^{k} t_\nu. \tag{29}$$

Furthermore, we have

Lemma 6. Let $\mathfrak{x}$ be any surface in the interior of $M_1$, and $\mathfrak{z} = f(\mathfrak{x}, t_1, t_2, \dots, t_k)$ any surface in $M_{k+1}$ with $\mathfrak{x}$ as pre-image. Then $t_\nu = 1$ for at least one $t_\nu$ of $t_1, \dots, t_k$.

Proof. If the lemma were false, there would exist a $\mathfrak{z} = f(\mathfrak{x}, t_1, t_2, \dots, t_k)$ in $M_{k+1}$ for which $\mathfrak{x}$ is interior to $M_1$ and $t_\nu \ne 1$, $\nu = 1, 2, \dots, k$. Since $t_k \ne 1$, it follows that the surface $f(\mathfrak{x}, t_1, \dots, t_{k-1})$ belongs to $\bar{A}_k$. Hence there are surfaces $f(\mathfrak{x}', t_1', \dots, t_{k-1}')$ belonging to $A_k$ which are arbitrarily near $f(\mathfrak{x}, t_1, \dots, t_{k-1})$; $\mathfrak{x}'$, being near $\mathfrak{x}$, is still interior to $M_1$ and $t_1', \dots, t_{k-1}'$ are $\ne 1$. In particular, arbitrarily near the region $H'$ for $\mathfrak{x}$ are regions $H_1'$ for $\mathfrak{x}'$ which contain boundaries at a distance $< 3\tau$ from the center of $K_k$.

Since $t_{k-1}' \ne 1$, the surface $f(\mathfrak{x}', t_1', \dots, t_{k-2}')$ belongs to $\bar{A}_{k-1}$. As previously, there are surfaces $f(\mathfrak{x}'', t_1'', \dots, t_{k-2}'')$ arbitrarily near $f(\mathfrak{x}', t_1', \dots, t_{k-2}')$ which belong to $A_{k-1}$. In particular, arbitrarily near $H_1'$ are regions $H_2'$ (for $\mathfrak{x}''$) which contain boundaries at a distance $< 3\tau$ from the center of $K_{k-1}$. Because $H_1'$ has a similar property for the square $K_k$, it follows that $H_2'$ contains boundaries at a distance $< 3\tau$ from the centers of $K_{k-1}$ and of $K_k$. Continuing in this way, one finally obtains regions $H_k'$ (which are regions for surfaces interior to $M_1$) with boundaries at a distance $< 3\tau$ from each of the centers of $K_1, K_2, \dots, K_k$. But this contradicts the facts that $H_k'$ contains less than $k$ small boundaries, each












small circle has a radius $\le \tau$ (see 1), 2) above), and the squares $K_1, K_2, \dots, K_k$ have a distance $\ge 16\tau$ from each other. The lemma is established.

The results, (28), (29) and Lemma 6, of this piecewise deformation are summarized in

Theorem 10. Let $\mathfrak{y}$ be a surface in $\mathfrak{P}$ not a minimal surface, and $a$ any value $\ge D[\mathfrak{y}]$. There are two positive constants $\delta$ and $\beta$, and a piecewise deformation of $\mathfrak{P}_a$ in itself which yields

$$\mathfrak{P}_a \sim P$$

(the homology taking place in $\mathfrak{P}_a$ and applying to any Vietoris cycle on $\mathfrak{P}_a$). Here, $P$ is a subspace of $\mathfrak{P}_a$ consisting of surfaces $\mathfrak{z}$ of the form $\mathfrak{z} = f(\mathfrak{x}, t_1, t_2, \dots, t_k)$, where $t_\nu$, $\nu = 1, 2, \dots, k$, takes the value 0, or values between 0 and 1, or the value 1 according to the situation of $\mathfrak{x}$ in $\mathfrak{P}_a$, and $f(\mathfrak{x}, t_1, \dots, t_k)$ has the following properties:

1) $f(\mathfrak{x}, t_1, \dots, t_k)$ is continuous in $\mathfrak{x}, t_1, \dots, t_k$; and $f(\mathfrak{x}, 0, 0, \dots, 0) = \mathfrak{x}$.

2) If $|\mathfrak{x} - \mathfrak{y}| > \delta$, then $t_1, \dots, t_k$ take only the value 0.

3) If $|\mathfrak{x} - \mathfrak{y}| \le \delta$, then $D[\mathfrak{z}] \le D[\mathfrak{x}] - \beta \sum t_\nu$.

4) If $|\mathfrak{x} - \mathfrak{y}| < \delta$, then $D[\mathfrak{z}] \le D[\mathfrak{x}] - \beta$.

The above theorem applies automatically to $\mathfrak{P}'$ and $\Omega$ as well as $\mathfrak{P}$. Throughout Theorem 10 replace $\mathfrak{P}$ and $P$ by $\Omega$ and $Q$.


11. The Main Theorems. Remarks 


On the basis of the reducibility condition (Theorem 8 or Theorem 9) and the variational condition (Theorem 10), it is easy to prove that on each $k$-cap with cap limit $a$ there is a minimal surface $\mathfrak{x}$ for which $D[\mathfrak{x}] = a$. For the meaning of these terms, and such a proof, see [6] and Theorem 6 of [12]. This establishes the validity of the Morse theory.

Main Theorem I. Let $\Gamma_1, \Gamma_2, \dots, \Gamma_k$ be $k$ non-intersecting closed rectifiable Jordan curves in space. Suppose that the reducibility condition of §8, page 213 has been established for each curve $\Gamma_\nu$ individually. Then the Morse theory applies to the minimal surfaces (degenerate as well as ordinary) bounded by $\Gamma_1, \dots, \Gamma_k$.

There remains the question of determining the connectivity numbers of $\mathfrak{P}$, where only those Vietoris cycles are considered which lie on $\mathfrak{P}_N$ for sufficiently large $N$. This is reduced by Theorems 2, 3 of Part I to the corresponding question for $\mathfrak{P}''$. The connectivity numbers of $\mathfrak{P}''$ have been determined for rather general classes of curves $\Gamma$ in [12], [7], [8]. They are: $R_0 = 1$, $R_1 = R_2 = \dots = R_n = \dots = 0$. Hence in these cases the connectivity numbers of $\mathfrak{P}$ are likewise $R_0 = 1$, $R_1 = R_2 = \dots = R_n = \dots = 0$, by Theorems 2, 3.

Main Theorem II. Let $\Gamma_1, \Gamma_2, \dots, \Gamma_k$ be $k$ non-intersecting simple closed polygons in space. Then the Morse theory applies to the minimal surfaces (degenerate as well as ordinary) bounded by $\Gamma_1, \dots, \Gamma_k$. In particular, if $M_n$ is












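the sum of the $n$th type numbers of all blocs of minimal surfaces bounded by $\Gamma_1, \Gamma_2, \dots, \Gamma_k$, and if each $M_n$ is finite, then

$$\begin{aligned} M_0 &\ge 1, \\ M_1 - M_0 &\ge -1, \\ &\;\;\vdots \\ M_n - M_{n-1} + \dots + (-1)^n M_0 &\ge (-1)^n, \\ &\;\;\vdots \end{aligned} \tag{30}$$

Also, $M_n = 0$ for all $n > 3k - 6 + V$, where $V$ is the total number of vertices in all the polygons $\Gamma_\nu$, $\nu = 1, \dots, k$.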
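The inequalities (30) are the special case $R_0 = 1$, $R_n = 0$ ($n > 0$) of the Morse relations between type numbers and connectivity numbers, in which the alternating sum of the $M$'s up to index $n$ dominates the corresponding alternating sum of the $R$'s. A minimal sketch of such a check follows; the function name `morse_inequalities_hold` and the sample type numbers are hypothetical and chosen only for illustration.

```python
def morse_inequalities_hold(M, R):
    """Check M_n - M_{n-1} + ... +- M_0 >= R_n - R_{n-1} + ... +- R_0
    for every index n covered by both lists."""
    alt_M = alt_R = 0
    for m, r in zip(M, R):
        alt_M = m - alt_M     # alternating sum of M_0..M_n, leading sign +
        alt_R = r - alt_R     # same for the connectivity numbers
        if alt_M < alt_R:
            return False
    return True


if __name__ == "__main__":
    R = [1, 0, 0]             # connectivity numbers R_0 = 1, R_n = 0 otherwise
    print(morse_inequalities_hold([2, 1, 0], R))   # hypothetical type numbers
    print(morse_inequalities_hold([1, 1, 0], R))   # violates the n = 2 inequality
```

With $R = (1, 0, 0, \dots)$ the right-hand alternating sums are $1, -1, 1, \dots$, that is $(-1)^n$, as in (30).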

Main Theorem II is a result of Part I, and §§9, 10. No use of §§7, 8 need be made.

An application of the Main Theorem I is to the case discussed by Morse and Tompkins in [7]. If each $\Gamma_\nu$ has the property (18), page 215, the Morse theory and the inequalities (30) apply.

Since degenerate as well as ordinary minimal surfaces are included in the main theorems above there are many more possibilities than in the case of one boundary. This leads to problems of the following kind. Suppose that a degenerate minimal surface $\mathfrak{x}$ consists of two pieces, one $\mathfrak{x}'$ bounded by a set $(\Gamma')$ of the boundaries and the other $\mathfrak{x}''$ bounded by the remaining set $(\Gamma'')$. What is the relation, in the form of inequalities, between the Morse type of $\mathfrak{x}$ in the space $\mathfrak{P}^{(\Gamma)}$ and the Morse types of $\mathfrak{x}'$ in $\mathfrak{P}^{(\Gamma')}$ and of $\mathfrak{x}''$ in $\mathfrak{P}^{(\Gamma'')}$?


COLLEGE OF THE CITY OF NEW YORK


BIBLIOGRAPHY 


1. Courant, R., Plateau's problem and Dirichlet's principle, Annals of Math., 38 (1937), pp. 679-724.
2. Courant, R., The existence of minimal surfaces of given topological structure under prescribed boundary conditions, Acta Math., 72 (1940), pp. 51-98.
3. Courant, R., Critical points and unstable minimal surfaces, Proc. Nat. Acad. Sci., 27 (1941), pp. 51-57.
4. Douglas, J., Solution of the problem of Plateau, Trans. Amer. Math. Soc., 33 (1931), pp. 263-321.
5. Douglas, J., Minimal surfaces of higher topological structure, Annals of Math., 40 (1939), pp. 205-298.
6. Morse, M., Functional topology and abstract variational theory, Mémorial des Sci. Math., vol. 92 (1939).
7. Morse, M., and Tompkins, C., The existence of minimal surfaces of general critical type, Annals of Math., 40 (1939), pp. 443-472.
8. Morse, M., and Tompkins, C., Minimal surfaces not of minimum type by a new mode of approximation, Annals of Math., 42 (1941), pp. 62-72.
9. Morse, M., and Tompkins, C., Unstable minimal surfaces of higher topological structure, Duke Math. Journ., 8 (1941), pp. 350-375.
10. Radó, T., On the problem of Plateau, Ergeb. der Math., vol. 2 (1933).
11. Shiffman, M., Bull. Amer. Math. Soc., 44 (1938), p. 637, abstract of a paper read to the society in Sept., 1938.
12. Shiffman, M., The problem of Plateau for non-relative minima, Annals of Math., 40 (1939), pp. 834-854.













ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 


ON THEORIES WITH A COMBINATORIAL DEFINITION OF 
“EQUIVALENCE” 


By M. H. A. Newman
(Received June 23, 1941) 


1 


The name “combinatorial theory” is often given to branches of mathematics in which the central concept is an equivalence relation defined by means of certain “allowed transformations” or “moves.” A class of objects is given, and it is declared of certain pairs of them that one is obtained from the other by a “move”; and two objects are regarded as “equivalent” if, and only if, one is obtainable from the other by a series of moves. For example, in the theory of free groups the objects are words made from an alphabet $a, b, \dots, a^{-1}, b^{-1}, \dots$, and a move is the insertion or removal of a consecutive pair of letters $xx^{-1}$ or $x^{-1}x$. In combinatorial topology the objects are complexes, and the allowed moves are “breaking an edge” by the insertion of a new vertex, or the reverse of this process.¹ In Church's “conversion calculus”² the rules II and III are “moves” of this kind.

In many such theories the moves fall naturally into two classes, which may be called “positive” and “negative.” Thus in the free group the cancelling of a pair of letters may be called a positive move, the insertion negative; in topology the breaking of an edge, in the conversion calculus the application of Rule II (elimination of a $\lambda$), may be taken as the positive moves. In theories that have this dichotomy it is always important to discover whether there is what may be called a “theorem of confluence,” namely, whether if $A$ and $B$ are “equivalent” it follows that there exists a third object, $C$, derivable both from $A$ and from $B$ by positive moves only. A closely connected problem is the search for “end-forms,” or “normal forms,” i.e. objects which admit no positive move. It is obvious that in a theory in which the confluence theorem holds no equivalence class can contain more than one end-form, but there remains the question whether in such a class any random series of positive moves must terminate at the end-form, or whether infinite series of moves may also exist.

The purpose of this paper is to make a start on a general theory of “sets of moves” by obtaining some conditions under which the answers to both the above questions are favorable. The results are essentially about “partially-ordered” systems, i.e. sets in which there is a transitive relation $>$, and sufficient conditions are given for every two elements to have a lower bound (i.e. for the set to be “directed”) if it is known that every two “sufficiently near” elements have a lower bound. What further conditions are required for the existence of a greatest lower bound is not relevant to the present purpose, and is reserved for a later discussion.





1 See Alexander [1] and Newman [1]. 
2 See Church [1] and references there given.




As an application the normal form theorem of Church and Rosser [1] in the 
conversion calculus is derived. 


2 


We are concerned with two kinds of entities, “objects” and the “moves” performed on them, and each move is associated with two objects, “initial” and “final.” We are therefore dealing essentially with indexed 1-complexes (in which, therefore, a positive sense is assigned in each 1-cell), the vertices being the “objects,” and the positive 1-cells the “moves.” It will be convenient to make use of this topological terminology.³ The incidence relations are in no way restricted: there may be many cells with the same vertices, and the initial and final vertices of a cell may coincide. In diagrams the positive 1-cells slope down the paper, and some of the terms used are chosen accordingly.

Vertices are denoted by italic letters, cells (the single word is used from now on for “positive 1-cell”) by the letters $\xi, \eta, \zeta, \omega$ with various suffixes. “$x\,\mu\,y$” means “there is a cell with initial vertex $x$ and final vertex $y$.” An ordered set of cells $\xi_1, \xi_2, \dots, \xi_k$ form a path $\pi$ if there are vertices $x_0, x_1, \dots, x_k$ such that $x_{i-1}$ and $x_i$ are the vertices of $\xi_i$ for $1 \le i \le k$. The cell $\xi_i$ is direct or reversed in $\pi$ according as it runs from $x_{i-1}$ to $x_i$ or from $x_i$ to $x_{i-1}$, and the path is denoted by $\epsilon_1\xi_1 + \epsilon_2\xi_2 + \dots + \epsilon_k\xi_k$ where $\epsilon_i$ is $\pm 1$ as $\xi_i$ is direct or reversed. If there are no reversed cells, $\pi$ is a descending path. It is convenient to regard a single vertex, $x$, as a “null path” with $x$ as initial and final vertex. A vertex which is not the initial vertex of any cell is a minimal vertex, or end.

If there is at least one non-null descending path from $x$ to $y$ we write $x > y$. $z$ is a lower (upper) bound of $x$ and $y$ if $x \ge z$ and $y \ge z$ (if $z \ge x$ and $z \ge y$).
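For a finite complex these definitions can be made concrete. The following sketch is an illustration only: the class name `IndexedComplex`, its methods, and the sample cells are assumptions introduced here, and cells with the same initial and final vertices are not distinguished, which does not affect the relation $>$.

```python
from collections import defaultdict


class IndexedComplex:
    """Finite indexed 1-complex: vertices are objects, cells are positive moves."""

    def __init__(self, cells):
        # cells: iterable of (initial vertex, final vertex) pairs
        self.succ = defaultdict(set)
        self.vertices = set()
        for x, y in cells:
            self.succ[x].add(y)
            self.vertices.update((x, y))

    def below(self, x):
        """Vertices y with x > y, i.e. reached by a non-null descending path."""
        seen, stack = set(), list(self.succ[x])
        while stack:
            y = stack.pop()
            if y not in seen:
                seen.add(y)
                stack.extend(self.succ[y])
        return seen

    def lower_bounds(self, x, y):
        """Vertices z with x >= z and y >= z."""
        return (self.below(x) | {x}) & (self.below(y) | {y})

    def ends(self):
        """Minimal vertices: those that are the initial vertex of no cell."""
        return {x for x in self.vertices if not self.succ[x]}


if __name__ == "__main__":
    c = IndexedComplex([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")])
    print(c.lower_bounds("b", "c"), c.ends())
```

In the sample complex the vertices $b$ and $c$ have the single lower bound $d$, which is also the only end.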


3 


Expressed in this terminology the confluence property is

(A) If $x_1$ and $x_2$ are connected by a path in the indexed complex $\Sigma$ they have a lower bound.

By a simple induction on the number of cells in a path from $x_1$ to $x_2$ this property can be deduced from the following special case of it:

(B) If $x_1$ and $x_2$ have an upper bound they have also a lower bound.

This in its turn is easily deduced from the still more special form (C):

(C) If $a\,\mu\,x_1$ and $a > x_2$, $x_1$ and $x_2$ have a lower bound.

The transition from (B) to (C) is a step towards localizing the property, and the theorems that will be proved in this paper give conditions in which the localization may be completed, i.e. in which (A) may be inferred from the following condition (holding for all $a$, $x_1$ and $x_2$):

(D) If $a\,\mu\,x_1$ and $a\,\mu\,x_2$, $x_1$ and $x_2$ have a lower bound.

Note. The cell and vertex terminology, although the most convenient for







3 The notions that arise are closely related to those of the theory of partially ordered sets, but usually not identical. Except in the case of identity the terms of that theory are therefore avoided.










our purpose, may suggest that “$x\,\mu\,y$” implies that $y$ is a “next” vertex below $x$. Actually the force of $\mu$ is that $y$ is an element satisfying $y < x$, and lying in a certain neighborhood of $x$. For example, all the conditions (A) to (D) are satisfied if the vertices are taken to be the points of a vertical plane, and the positive 1-cells the downward sloping directed segments of length less than 1.


4 


The simplest way of strengthening (D) so that it implies (A), is to require that paths descending from $x_1$ and $x_2$ to their lower bound shall each contain one cell; or, in terms of moves, that if two moves are possible on an object $X$, they can also be performed one after the other, and give the same result in either order.

Theorem 1. Let $\Sigma$ be such that if $a\,\mu\,x$ and $a\,\mu\,y$, and $x \ne y$, there exists $b$ such that $x\,\mu\,b$ and $y\,\mu\,b$. Then property (A) holds.

Let “$x\,\nu\,y$” denote “$x\,\mu\,y$ or $x = y$.” We prove that if $a\,\mu\,x$ and $a\,\mu\,y_1\,\mu\,y_2\,\mu \cdots \mu\,y_k$, there is a $b_k$ such that $x\,\nu\,b_1\,\nu\,b_2\,\nu \cdots \nu\,b_k$ and $y_k\,\nu\,b_k$, which is a stronger form of (C). Suppose this proved for $k - 1$ (the case $k = 1$ following immediately from the datum), and let $x\,\nu\,b_1\,\nu\,b_2\,\nu \cdots \nu\,b_{k-1}$, and $y_{k-1}\,\nu\,b_{k-1}$. If $y_{k-1} = b_{k-1}$ take $b_k$ to be $y_k$. If $y_{k-1}\,\mu\,b_{k-1}$, since also $y_{k-1}\,\mu\,y_k$ there exists a $b_k$ such that $b_{k-1}\,\nu\,b_k$ and $y_k\,\nu\,b_k$; and this completes the induction.

Corollary 1.1. The theorem remains true if “$x\,\nu\,b$ and $y\,\nu\,b$” is substituted for “$x\,\mu\,b$ and $y\,\mu\,b$” in the enunciation. (No change is needed in the proof.)

This almost trivial result is sufficient to settle many of the more familiar theorems of the kind that we are considering. In the “word groups” already referred to, a move is to be regarded as completely determined by the initial and final words (so that e.g. $xx^{-1}x \to x$ is regarded as the same move whether the first or last two letters are cancelled). Hence two pairs $xx^{-1}$ and $yy^{-1}$ (where $x$ and $y$ may be of the form $u^{-1}$) in the same word $W$, that give rise to different possible moves on $W$, have no common letter and give the same result if cancelled in either order. Since every series of positive moves (cancellations) terminates it follows that all such series starting from a given word $W$ lead to a common end-form.
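The word-group example can be checked mechanically on short words. In the sketch below, offered only as an illustration, a lower-case letter stands for a generator and the corresponding upper-case letter for its inverse; this encoding and the names `moves` and `end_forms` are assumptions made here. A move is identified with its resulting word, in agreement with the convention just stated.

```python
def moves(word):
    """Words obtained from `word` by one cancellation (one positive move).
    A set is returned, so cancellations giving the same word count once."""
    return {word[:i] + word[i + 2:]
            for i in range(len(word) - 1)
            if word[i] != word[i + 1] and word[i].lower() == word[i + 1].lower()}


def end_forms(word):
    """All end-forms reachable from `word` by cancellations taken in any order."""
    seen, stack, ends = {word}, [word], set()
    while stack:
        w = stack.pop()
        successors = moves(w)
        if not successors:
            ends.add(w)              # no positive move is possible on w
        for v in successors:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return ends


if __name__ == "__main__":
    print(moves("aAa"))              # both cancellations give the same word
    print(end_forms("abBAcCa"))      # a single end-form is reached
```

Every order of cancellation is explored, and only one end-form is found for each starting word, as the argument above predicts.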

Theorems of the Jordan-Hölder type also belong to this category. The kernel of these theorems is a theorem on modular lattices (say with the partial ordering $>$ and the operations $\vee$ and $\wedge$). If $X, Y, Z$ are consecutive elements in a descending chain, $S$, in such a lattice let the chain $S'$ obtained by substituting $Y'$ for $Y$ be said to be directly related to $S$ ($S'$ dr $S$) if $X = Y \vee Y'$ and $Z = Y \wedge Y'$; and $S'$ shall be related to $S$ if it is obtainable from $S$ by a succession of such steps. The theorem in question is then that from any two finite descending chains, $S$ and $S'$, from $A$ to $B$, a pair of related chains $S_1$ and $S_1'$ can be obtained by the insertion of a finite number of additional terms in $S$ and $S'$ respectively. This is evidently a “confluence” theorem. To apply Theorem 1 we take as a typical vertex of $\Sigma$ the class $[S]$ of all chains related to a chain $S$, and as a positive 1-cell the ordered pairs of classes $[S_1]$, $[S_2]$, where $S_2$ is obtained from $S_1$












by the insertion of one additional term, say $P$, between $X$ and $Y$. Then if $S_1'$ dr $S_1$, the insertion of a suitable term in $S_1'$ gives a chain $S_2'$ related to $S_2$,⁴ and hence more generally any member of $[S_1]$ can be made into a chain related to $S_2$, by the insertion of one suitable term. Two successive “positive moves” on $[S_1]$ can therefore be represented by two successive insertions of new elements in the same chain $S_1$, and evidently the order in which they are inserted does not affect the result. The system therefore fulfils the conditions of Theorem 1. But any two chains descending from $A$ to $B$ have an “upper bound” in $\Sigma$, namely the class $[AB]$. Therefore they have a “lower bound,” and this is the required result.


5 


In these examples it is obvious that if an end-form exists it is reached by random descent. This is necessarily so in all systems with non-interference of moves:

Theorem 2. Under the conditions of Theorem 1, if there is a descending path of $k$ cells from $a$ to an end $e$, no descending path from $a$ contains more than $k$ cells.

If $k = 1$, $\Sigma$ cannot contain a cell $ay$ with $y \ne e$, since if it does $b$ exists such that $y\,\mu\,b$ and $e\,\mu\,b$, and $e$ is not an end. In the general case let $\pi$ be a descending path $\xi_1 + \xi_2 + \dots + \xi_k$ joining $a$ to $e$, and let $\eta_1 + \eta_2 + \dots + \eta_j$ be any descending path from $a$. Let $\xi_1$ and $\eta_1$ be cells $ax$ and $ay$. If $x = y$ it follows immediately from an induction that $j \le k$. If not, let the cells $\zeta$ and $\omega$ descend from $x$ and $y$ to the common vertex $w$. By Theorem 1 there is a descending path $\sigma$ from $w$ to a vertex $\le e$, i.e., since $e$ is an end, to $e$ itself. Since $\xi_2 + \dots + \xi_k$ has $k - 1$ cells, $\zeta + \sigma$ has, by an inductive hypothesis, at most $k - 1$ cells; therefore $\omega + \sigma$, and finally also $\eta_2 + \dots + \eta_j$, have at most $k - 1$ cells, i.e. $j \le k$.

Corollary 2.1. Every descending path from $a$ is part of a descending path of $k$ cells from $a$ to $e$ (i.e. there is “random descent” to $e$).
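Theorem 2 and its corollary can be observed on small finite complexes. The sketch below is an illustration under assumptions of its own: a complex is given as a dictionary from each vertex to the set of final vertices of its cells, the names `theorem1_hypothesis` and `maximal_path_lengths` are hypothetical, and both sample complexes (`grid` and `triangle`) are constructed here rather than taken from the text. The second sample satisfies only the weakened condition of Corollary 1.1.

```python
from functools import lru_cache
from itertools import combinations


def theorem1_hypothesis(succ):
    """True if any two distinct one-cell descendants x, y of a vertex
    have a common one-cell descendant b."""
    return all(succ.get(x, set()) & succ.get(y, set())
               for targets in succ.values()
               for x, y in combinations(sorted(targets), 2))


def maximal_path_lengths(succ, a):
    """Numbers of cells in the maximal descending paths from a.
    The complex is assumed to contain no descending cycle."""
    @lru_cache(maxsize=None)
    def lengths(x):
        successors = succ.get(x, ())
        if not successors:
            return frozenset({0})            # x is an end
        return frozenset(1 + n for y in successors for n in lengths(y))
    return set(lengths(a))


# Moves "increase i" and "increase j" on pairs (i, j) with 0 <= i, j <= 2.
grid = {(i, j): {(p, q) for p, q in ((i + 1, j), (i, j + 1)) if p <= 2 and q <= 2}
        for i in range(3) for j in range(3)}
# Cells a->b, a->c, b->c: b and c have the lower bound c but no common successor.
triangle = {"a": {"b", "c"}, "b": {"c"}}

if __name__ == "__main__":
    print(theorem1_hypothesis(grid), maximal_path_lengths(grid, (0, 0)))
    print(theorem1_hypothesis(triangle), maximal_path_lengths(triangle, "a"))
```

In the first complex every maximal descending path from $(0, 0)$ has the same number of cells; in the second, paths of different lengths reach the same end.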

That Theorem 2 and Corollary 2.1 fail if the condition is weakened as in 
Corollary 1.1 is shown by the example in Fig. 1, (positive cells slope downward). 

The main criteria for “confluence” are established in Theorems 3, 4, 5, and 9, 
all of which are independent. It is Theorems 5 and 9 that are used in the 
application to the conversion calculus. 

Theorem 3. In an indexed complex in which all descending paths are finite, (D) implies (A).

(Note that in such a complex “$>$” is a proper ordering, since if $x > x$ an infinite descending path is obtained by going round and round the re-entrant path from $x$ to $x$.)
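The role of the finiteness hypothesis can be seen on a small example. The sketch below checks (D), (A) and the absence of infinite descending paths for a finite complex given as a dictionary of successor sets. The function names and both sample complexes are assumptions introduced for illustration; in particular the four-vertex complex `cyclic`, with cells $b \to a$, $b \to c$, $c \to b$, $c \to d$, is not taken from the text.

```python
from itertools import combinations


def reach(succ, x):
    """Set of z with x >= z (x itself included)."""
    seen, stack = {x}, [x]
    while stack:
        for y in succ.get(stack.pop(), ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def vertices(succ):
    vs = set(succ)
    for targets in succ.values():
        vs.update(targets)
    return vs


def has_lower_bound(succ, x, y):
    return bool(reach(succ, x) & reach(succ, y))


def satisfies_D(succ):
    """(D): two vertices one cell below a common vertex have a lower bound."""
    return all(has_lower_bound(succ, x1, x2)
               for targets in succ.values()
               for x1, x2 in combinations(sorted(targets), 2))


def connected(succ, x, y):
    """True if x and y are joined by a path of direct or reversed cells."""
    nbrs = {v: set() for v in vertices(succ)}
    for a, targets in succ.items():
        for b in targets:
            nbrs[a].add(b)
            nbrs[b].add(a)
    seen, stack = {x}, [x]
    while stack:
        for w in nbrs[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return y in seen


def satisfies_A(succ):
    """(A): any two connected vertices have a lower bound."""
    return all(has_lower_bound(succ, x, y)
               for x, y in combinations(sorted(vertices(succ)), 2)
               if connected(succ, x, y))


def descending_paths_finite(succ):
    """In a finite complex this holds exactly when no vertex satisfies x > x."""
    return not any(x in reach(succ, y) for x in succ for y in succ[x])


diamond = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}}
cyclic = {"b": {"a", "c"}, "c": {"b", "d"}}

if __name__ == "__main__":
    for name, cx in (("diamond", diamond), ("cyclic", cyclic)):
        print(name, satisfies_D(cx), satisfies_A(cx), descending_paths_finite(cx))
```

In `cyclic` condition (D) holds, but the two ends $a$ and $d$ are connected and have no lower bound; the complex contains the infinite descending path $b \to c \to b \to \cdots$, so the theorem does not apply to it.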





4 Namely, if $X$ and $Y$ are in $S_1'$, insert $P$ itself; if $XYZ$ and $XY'Z$ are consecutive terms of $S_1$ and $S_1'$ respectively, insert $P' = Y' \wedge P$ in $S_1'$; if $UXY$ and $UX'Y$, insert $P'' = X' \vee P$. It is easily shewn that in the second case $XPYZ$ is related to $XY'P'Z$, in the third $UXPY$ to $UP''X'Y$. Cf. Birkhoff [1] p. 37.










The symbol $[\xi]_k$ is used as an abbreviation for $\xi_1 + \xi_2 + \dots + \xi_k$. It is convenient to allow the value $k = 0$, $[\xi]_0$ being a null path.

A peak of a path is the common vertex of a successive pair of cells $-\xi + \eta$ (“up” before “down”).

Let $[\xi]_j$ and $[\eta]_k$ be paths descending from a vertex $a$ to vertices $b$ and $c$ respectively. Let $\pi_1$ be the path $-[\xi]_j + [\eta]_k$, and let it be assumed that paths $\pi_2, \pi_3, \dots, \pi_r$, each leading from $b$ to $c$, have been defined. Let $X_r$ be the (finite) indexed subcomplex of $\Sigma$ formed by all the cells occurring in the paths $\pi_1, \dots, \pi_r$. The depth in $X_r$ of a vertex $x$ is defined to be the maximum possible number of cells in a descending path from $a$ to $x$ in $X_r$ (or 0 if there is no such path). Thus the depth of any vertex in $X_{r+1}$ is not less than its depth in $X_r$.

[Fig. 1]

If $\pi_r$ contains no peak, $\pi_{r+1}$ is not defined. If it contains at least one peak, choose one, say $y$, of minimum depth in $X_r$ among peaks of $\pi_r$. Let the vertices immediately preceding and succeeding $y$ on $\pi_r$ be $u$ and $v$, the (positive) cells $yu$ and $yv$ being $\omega$ and $\omega'$ respectively. There exist, by (D), paths $\sigma$ and $\tau$ (either or both of which may be null), descending from $u$ and $v$ to a common vertex $w$, and $\pi_{r+1}$ is formed from $\pi_r$ by substituting $\sigma - \tau$ for $-\omega + \omega'$. The effect is to replace the peak at $y$ by at most two new ones, of depths in $X_{r+1}$ at least 1 greater than that of $y$ in $X_r$ (or zero). By a simple induction it follows that $\pi_r$ has at most $r$ peaks; and if we make the inductive hypothesis that the peaks of $\pi_{2^n}$ are of depth at least $n$ in $X_{2^n}$ it follows that, if $r \le 2^n$, at most $2^n - r$ peaks of $\pi_{2^n + r}$ are of depth $n$ or less in $X_{2^n + r}$. Thus the induction is complete and it is proved that if $r \ge 2^n$ all peaks of $\pi_r$ are of depth at least $n$.

If $[\zeta]_m$ is a descending path of maximum length in $X_r$ from $a$ to a given vertex $z$, where $r = 2^n$, $\zeta_i$ belongs to $X_{2^i}$ for $i = 1, 2, \dots, m$. Suppose that, for a certain $i$, $X_j$ is the first of the $X$'s to contain $\zeta_i$, where $j > 2^i$. Then $[\zeta]_i$ is a descending path in $X_j$ of maximum length to its final vertex, $z_i$, since any longer one could










be used as part of a longer descending path to $z$ in $X_r$. Thus the depth of $z_i$ in $X_j$ is $i$. Since $\zeta_i$ belongs to $X_j$ but to no earlier $X$, it is a cell of one of the descending paths that eliminate a peak, $y$, in the formation of $\pi_j$ from $\pi_{j-1}$. By the result of the preceding paragraph, if the depth of $y$ in $X_{j-1}$ is $p$, $j - 1 < 2^{p+1}$. But the depth of $z_i$ in $X_j$ exceeds that of $y$ in $X_j$ by at least 2, i.e. $i \ge p + 2$. Therefore $j - 1 < 2^{i-1}$, $j < 1 + 2^{i-1} \le 2^i$, contrary to the hypothesis.

The series of paths $\pi_1, \pi_2, \dots$ terminates. If not choose, for each $n$, a maximal descending path, $\sigma_n$, in $X_{2^n}$ from $a$ to a peak of $\pi_{2^n}$. Since the first cell of each of these paths is in the finite complex $X_2$ there is at least one cell, $\omega_1$, which is the first cell of $\sigma_n$ for an infinity of $n$. Since the second cell of each of this infinite subsequence is in $X_4$ there is at least one cell, $\omega_2$, such that $\omega_1 + \omega_2$ is the beginning of an infinity of the $\sigma_n$. Continuing in this way we obtain an infinite descending path $\omega_1 + \omega_2 + \cdots$ in $\Sigma$, contrary to its given property.

Thus the series of paths $\pi_r$ from $b$ to $c$ terminates in a path $\pi_s$, which, since it has no peak, must descend or ascend directly from $b$ to $c$, or else descend from $b$ to a vertex $w$ and then rise to $c$.
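The reduction used in this proof can be traced on a small finite complex. The sketch below is an illustration under simplifying assumptions and not the construction of the proof itself: paths are stored as vertex sequences rather than as signed sums of cells, the first peak encountered is eliminated (the proof selects one of minimum depth), and the descending paths $\sigma$ and $\tau$ are found by breadth-first search. The names `descents` and `eliminate_peaks` and the sample complex are hypothetical, and the complex is assumed finite, free of descending cycles, and to satisfy (D).

```python
from collections import deque


def descents(succ, x):
    """Map each z with x >= z to one descending vertex sequence from x to z."""
    paths, queue = {x: [x]}, deque([x])
    while queue:
        y = queue.popleft()
        for z in sorted(succ.get(y, ())):
            if z not in paths:
                paths[z] = paths[y] + [z]
                queue.append(z)
    return paths


def eliminate_peaks(succ, path):
    """Replace peaks u <- y -> v by (sigma - tau) until no peak is left."""
    path = list(path)
    while True:
        peak = next((i for i in range(1, len(path) - 1)
                     if path[i - 1] in succ.get(path[i], ())
                     and path[i + 1] in succ.get(path[i], ())), None)
        if peak is None:
            return path
        u, v = path[peak - 1], path[peak + 1]
        from_u, from_v = descents(succ, u), descents(succ, v)
        common = [w for w in from_u if w in from_v]   # lower bounds of u and v
        if not common:
            raise ValueError("condition (D) fails at a peak")
        w = common[0]
        # sigma descends from u to w; tau is traversed upward from w to v
        path[peak - 1:peak + 2] = from_u[w] + from_v[w][-2::-1]


if __name__ == "__main__":
    succ = {"a": {"b", "c"}, "b": {"d"}, "c": {"d", "e"}, "d": {"g"}, "e": {"g"}}
    # pi_1 = -[xi] + [eta] with xi: a -> b and eta: a -> c -> e
    print(eliminate_peaks(succ, ["b", "a", "c", "e"]))   # ['b', 'd', 'g', 'e']
```

In the sample run two peaks are removed in turn, first at $a$ and then at $c$, and the final path descends from $b$ to $g$ and rises to $e$, so that $g$ is a lower bound of $b$ and $e$.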

The finiteness condition imposed on descending paths in Theorem 3 cannot be replaced by the corresponding “completeness” condition, that every descending chain of vertices has a lower bound in $\Sigma$. This is shown by the complex in Fig. 3, in which the vertices $c$ and $d$ are lower bounds of all sets of vertices not containing either of them; but $c$ and $d$ have themselves no lower bound.


6. Topology of $\Sigma$


The complex $\Sigma$ can be made into a 2-complex, $\Sigma^2$, by adding a 2-cell bounded by each of the 1-cycles $\omega + \sigma - \tau - \omega'$ occurring in the proof of Theorem 3 (one for each $\pi_r$). Every component of $\Sigma^2$ is simply connected. Any two paths, $\pi$ and $\pi'$, connecting vertices $a_1$ and $b_1$ are deformable, by the method of Theorem 3, into paths $\sigma_1 - \tau_1$ and $\sigma_1' - \tau_1'$ respectively, where $\sigma_1$ and $\tau_1$ descend to a vertex $a_2$, $\sigma_1'$ and $\tau_1'$ to $b_2$; and if $\approx$ stands for “is deformable into,” $-\tau_1 + \tau_1' \approx \sigma_2 - \tau_2$, and $-\sigma_1 + \sigma_1' \approx \sigma_2' - \tau_2'$, where $\sigma_2, \tau_2, \sigma_2', \tau_2'$ are descending paths, the first two to $a_3$, the second two to $b_3$. In this way paths $\sigma_n$ and $\tau_n$ descending to $a_{n+1}$, and $\sigma_n'$ and $\tau_n'$ to $b_{n+1}$, are defined for every $n$. If an infinity of different paths descending from $a_1$ could be made from the $\sigma_i$, $\tau_i$, $\sigma_i'$ and $\tau_i'$, an infinity of them would necessarily contain one or other of $\sigma_1$, $\sigma_1'$, say $\sigma_1$; and of these an infinity would contain one or other of $\sigma_2$, $\sigma_2'$, say $\sigma_2$; and so on. The descending path $\sigma_1 + \sigma_2 + \cdots$ so constructed would have an infinity of different paths as subsets, and would therefore be infinite, contrary to the postulated property of $\Sigma$. The number of different paths must therefore be finite.

It follows that for some $m$, $\sigma_m = \tau_m = \sigma_m' = \tau_m' = 0$; i.e.

$$\pi - \pi' \approx \sigma_1 - \tau_1 + \tau_1' - \sigma_1' \approx \cdots \approx \sigma_m - \tau_m + \tau_m' - \sigma_m' = 0.$$










7 


To establish our second criterion, we suppose that $\Sigma$ is the sum of two
subcomplexes,* $L$ and $R$, and shall use the terms “L-cell,” “R-path,” etc., in an
obvious sense. “$x \lambda y$” and “$x \rho y$” shall mean that there is a cell $xy$ in $L$ or $R$
respectively, and $xLy$ and $xRy$ that $x > y$ in $L$ or $R$. (In diagrams the positive
L- and R-cells will slope down towards the left and right respectively.) We
denote by Q the following property of $\Sigma$.

(Q) If $x \lambda y$ and $x \rho z$ there exists a vertex $w$ such that $zLw$, and either $y = w$ or $yRw$.
(We require $zLw$, which is not necessarily implied by $z = w$. The possibility
that $y = z$ is not excluded.)
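
Property (Q) can be tested mechanically on a finite complex. The following is a minimal sketch in Python, under the assumption (an interpretation, not a statement of the text) that $xLy$ means that a non-null descending L-path runs from $x$ to $y$, and likewise for $R$. The function names and the diamond-shaped example are hypothetical.

```python
from itertools import product

def reachable(cells, start):
    """Vertices reachable from `start` by a non-null descending path made of `cells`."""
    seen, stack = set(), [v for (u, v) in cells if u == start]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(w for (u, w) in cells if u == v)
    return seen

def has_property_q(l_cells, r_cells):
    """Check (Q): for every L-cell x->y and R-cell x->z there is a vertex w
    with zLw (non-null L-path) and either y == w or yRw."""
    for (x, y), (x2, z) in product(l_cells, r_cells):
        if x != x2:
            continue
        below_z = reachable(l_cells, z)        # every w with zLw
        below_y = reachable(r_cells, y) | {y}  # every w with y == w or yRw
        if not (below_z & below_y):
            return False
    return True

# Hypothetical example: a diamond with L-cells a->b, c->d and R-cells a->c, b->d.
L_CELLS = {("a", "b"), ("c", "d")}
R_CELLS = {("a", "c"), ("b", "d")}
DIAMOND_OK = has_property_q(L_CELLS, R_CELLS)
# Dropping the L-cell c->d leaves no w with cLw, so (Q) fails.
BROKEN_OK = has_property_q({("a", "b")}, R_CELLS)
```

In the diamond the pair $a \lambda b$, $a \rho c$ is closed by $w = d$; once the cell from c to d is removed no vertex lies below c in $L$, and the check fails.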

THEOREM 4. If, in a complex with the property Q, all L-paths are finite, then
if $xLy$ and $xRz$ there exists a vertex $w$ such that $zLw$ and either $y = w$ or $yRw$.

Fig. 2

If all L-paths are finite, then, in property (Q), $z \neq w$.

It is sufficient to prove the theorem when $x \lambda y$, the general case then following
by induction. Let $\eta$ be an L-cell from $x$ to $y$, and $[\zeta]_p$ an R-path from $x$ to $z$.
Let $\pi_1$ be $-\eta + [\zeta]_p$, and suppose, inductively, that a path $\pi_r$ from $y$ to $z$ has
already been defined.

If $\pi_r$ has no peak $\pi_{r+1}$ is not defined; otherwise let $v_0$ be the last peak on $\pi_r$,
from $y$ towards $z$. We assume, inductively, that in proceeding from $y$ towards $z$
the direct (“downward”) cells of $\pi_r$ are in $R$ and the reversed cells in $L$, an
assumption evidently satisfied for $r = 1$. The part $v_0 \cdots z$ of $\pi_r$ is of the form
$[\omega]_q - \sigma$ where $[\omega]_q$ is a descending R-path and $\sigma$ a descending L-path. $\sigma$ may
be null, but $q \neq 0$ since $v_0$ is a peak. Let $\xi_1$ be the predecessor of $\omega_1$ in $\pi_r$,


* This always means “indexed subcomplex,” the positive direction in each 1-cell agreeing
with that in $\Sigma$.







(and therefore an L-cell). Assuming inductively that $\xi_m$ is defined, for some
$m \le q$, as an L-cell with the same initial vertex as $\omega_m$, let $\sigma'_m$ and $\sigma_m$ be the R-
and L-paths which, by (Q), descend from the final vertices of $\xi_m$ and $\omega_m$ to a
common vertex. Then $\sigma_m$ is not null, and we define $\xi_{m+1}$ to be its first cell:
say $\sigma_m = \xi_{m+1} + \tau_m$. The path $\pi_{r+1}$ is now defined to be the result of
substituting $\sigma'_1 - \tau_1 + \sigma'_2 - \tau_2 + \cdots + \sigma'_q - \sigma_q$ for $-\xi_1 + [\omega]_q$. It evidently has
the property that reversed cells are in $L$ and direct cells in $R$, and the inductive
definition of $\pi_r$ is therefore completed.

If $v_q$ is the final vertex of $\omega_q$, and $-\sigma$ is the portion $v_q \cdots z$ of $\pi_r$, $\sigma$ is a
descending L-path from $z$. The corresponding portion of $-\pi_{r+1}$ is $\sigma + \sigma_q$,
with at least one more cell. Since all L-paths are finite it follows that the process
of constructing paths $\pi_r$ terminates after a certain number, $k$, of steps, i.e. $\pi_k$
has no peaks and is therefore a descending (possibly null) R-path from $y$ to a
vertex $w$, followed by an ascending (non-null) L-path from $w$ to $z$. Thus $yRw$
or $y = w$, and $zLw$.

Fig. 3

COROLLARY 4.1. If (Q) is strengthened by excluding the possibility $y = w$,
Theorem 4 may be strengthened in the same way. (Obvious from the method of
proof.)

COROLLARY 4.2. A descending L-path and a descending R-path have at most
one common vertex. If the two paths have their initial and final vertices, a and b,
in common, i.e. if aLb and aRb, there is a vertex c such that bLc and bRc (the
alternative b = c being impossible in this case); and a vertex d such that cLd
and cRd; and so on. The path $a \cdots b \cdots c \cdots d \cdots$ is an infinite descending
L-path.

In particular a cell cannot be both an L- and an R-cell. This does not mean
that the condition (Q) could be weakened in Theorem 4 by adding “if $y \neq z$”
at the beginning. That this would make the theorem untrue is shown by the
example in Fig. 3, where segments sloping down towards the left and right
belong to L and R respectively, and the cells marked $b_i c$ are in both L and R.










The condition (Q) is satisfied for pairs with $y \neq z$, and no descending L-path has
more than two cells; but c and d have no lower bound.

Theorem 4 also fails if the alternative “$z = w$” is allowed in (Q). This is
seen by omitting the vertices $b_i$ in Fig. 3 so that each $a_i c$ becomes a single R-cell
(but not now an L-cell).


8 


We return to the consideration of indexed 1-complexes in general. A set of
(positive) cells, $E_x$, is at $x$ if $x$ is the initial vertex of every member of $E_x$.
(The null-set is at every vertex.) We suppose that if $\xi$ is a cell $xz$, each cell $\eta$
at $x$ has a finite set of cells at $z$, called $\eta \mid \xi$, assigned to it as its $\xi$-derivate.*
The $\xi$-derivate, $E_x \mid \xi$, of a set $E_x$ is the logical sum of the derivates of members
of $E_x$. (An appearance of the symbol $E_x \mid \xi$ implies that $\xi$ is at $x$.) If $\pi$ is a
descending path from $x$, $E_x \mid \pi$, the derivate of $E_x$ by continuation along $\pi$, is
defined inductively by the equation

\[ E_x \mid (\pi + \xi) = (E_x \mid \pi) \mid \xi; \]

and if $\pi$ is null, $E_x \mid \pi = E_x$. We usually write $E \mid (\pi + \xi)$ without brackets:
$E \mid \pi + \xi$. The path $[\xi]_m$ is a development of $E_x$ if, for $1 \le i \le m$, $\xi_i \in E_x \mid [\xi]_{i-1}$.
The development is complete if $E_x \mid [\xi]_m = 0$, partial if not.
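
The definitions of derivate, continuation and development can be sketched in Python on a small model. The model is an assumption made only for illustration and is not taken from the text: a vertex is a finite set of labels, a cell `(vertex, label)` deletes one label, and the derivate of a cell by a different cell is the same label carried to the new vertex.

```python
def final_vertex(cell):
    """Final vertex of the cell (vertex, label): the vertex with the label removed."""
    vertex, label = cell
    return vertex - {label}

def derivate(eta, xi):
    """The xi-derivate of eta in the toy model (both cells at the same vertex)."""
    assert eta[0] == xi[0], "both cells must be at the same vertex"
    if eta == xi:
        return set()
    return {(final_vertex(xi), eta[1])}

def derivate_of_set(cells, xi):
    """E | xi: the logical sum (union) of the derivates of the members of E."""
    result = set()
    for eta in cells:
        result |= derivate(eta, xi)
    return result

def continue_along(cells, path):
    """E | pi, from E | (pi + xi) = (E | pi) | xi and E | null = E."""
    for xi in path:
        cells = derivate_of_set(cells, xi)
    return cells

def is_development(cells, path):
    """Each cell of the path must belong to the continuation along the earlier cells."""
    for xi in path:
        if xi not in cells:
            return False
        cells = derivate_of_set(cells, xi)
    return True

def is_complete_development(cells, path):
    """A development is complete when the continuation along it is null."""
    return is_development(cells, path) and not continue_along(cells, path)

x = frozenset("abc")
E = {(x, "a"), (x, "b")}
path = [(x, "a"), (frozenset("bc"), "b")]
```

Here `path` is a complete development of `E` ending at the vertex `{c}`, while its first cell alone is a partial development.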

The letters CD are used as an abbreviation for ‘complete development.’ 
We postulate the following conditions on the derivates: 

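($\Delta_1$) $\eta \mid \xi$ is null if, and only if, $\eta = \xi$;

($\Delta_2$) if $\eta \neq \xi$, $(\eta \mid \zeta) \cap (\xi \mid \zeta) = 0$;

($\Delta_3$) if $\eta$ and $\zeta$ are distinct cells at $x$, there exist developments $\kappa_\eta$ and $\kappa_\zeta$ of $\eta \mid \zeta$
and $\zeta \mid \eta$ respectively, with a common final vertex $w$.

($\Delta_4$) with the notation of ($\Delta_3$), $\xi \mid (\eta + \kappa_\zeta) = \xi \mid (\zeta + \kappa_\eta)$, for any $\xi$ at $x$.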

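It follows from ($\Delta_4$), by summation, that the derivates of any set $E_x$ by
continuation along $\eta + \kappa_\zeta$ and $\zeta + \kappa_\eta$ are the same. A further consequence is that
$\kappa_\eta$ and $\kappa_\zeta$ are complete developments of $\eta \mid \zeta$ and $\zeta \mid \eta$ respectively. For

\[ (\eta \mid \zeta) \mid \kappa_\eta = \eta \mid \zeta + \kappa_\eta = \eta \mid \eta + \kappa_\zeta = 0. \]

From ($\Delta_2$) it follows by induction on the length of $\pi$ that if $E^1_x \cap E^2_x$ is null,
$(E^1_x \mid \pi) \cap (E^2_x \mid \pi)$ is also null.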
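
The four local conditions on derivates can be verified exhaustively at a vertex of a small model. The sketch below is illustrative only; it assumes a hypothetical complex in which a vertex is a set of labels, a cell `(vertex, label)` removes one label, and a cell's derivate by another cell is the same label at the new vertex. In this model the developments required by the third condition are single cells.

```python
def end(cell):
    """Final vertex of the cell (vertex, label) in the toy complex."""
    vertex, label = cell
    return vertex - {label}

def derivate(eta, xi):
    """Derivate of eta by xi, for two cells at the same vertex."""
    return set() if eta == xi else {(end(xi), eta[1])}

def along(cells, path):
    """Continuation of a set of cells along a descending path."""
    for xi in path:
        cells = set().union(*(derivate(eta, xi) for eta in cells))
    return cells

def check_conditions(vertex):
    """Check the four derivate conditions for all cells at `vertex`."""
    cells = [(vertex, a) for a in sorted(vertex)]
    for eta in cells:
        for xi in cells:
            # First condition: the derivate is null if, and only if, eta = xi.
            if (not derivate(eta, xi)) != (eta == xi):
                return False
            for zeta in cells:
                # Second condition: derivates of distinct cells are disjoint.
                if eta != xi and derivate(eta, zeta) & derivate(xi, zeta):
                    return False
    for eta in cells:
        for zeta in cells:
            if eta == zeta:
                continue
            k_eta = list(derivate(eta, zeta))   # development of eta | zeta
            k_zeta = list(derivate(zeta, eta))  # development of zeta | eta
            # Third condition: the two developments end at a common vertex.
            if end(k_eta[-1]) != end(k_zeta[-1]):
                return False
            # Fourth condition: both routes give every cell the same derivates.
            for xi in cells:
                if along({xi}, [eta] + k_zeta) != along({xi}, [zeta] + k_eta):
                    return False
    return True

ALL_HOLD = check_conditions(frozenset("abc"))
```

The check only confirms that this particular model satisfies the conditions; it says nothing about complexes in which derivates may contain several cells.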

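Lemma 1. If $\pi$ is a development of $E^1_x$, and $E^2_x \mid \pi \subseteq E^1_x \mid \pi$, then $E^2_x \subseteq E^1_x$.

Let $\pi$ be $[\xi]_m$. Let $j$ be the least integer such that $E^2_x \mid [\xi]_j \subseteq E^1_x \mid [\xi]_j$. If the
lemma is false $j \ge 1$, and $E^2_x \mid [\xi]_{j-1}$ contains a cell $\zeta$ not in $E^1_x \mid [\xi]_{j-1}$. Hence
$\zeta \neq \xi_j$, and $\zeta \mid \xi_j$ is a non-null subset of $E^2_x \mid [\xi]_j$ not contained in $E^1_x \mid [\xi]_j$,
contrary to the hypothesis.

In particular if $E^2_x \mid \pi = 0$, $E^2_x \subseteq E^1_x$.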

It is assumed, further, that a relation J holds between certain of the pairs of 


* For an illustrative example of derivates see §13. 







cells at a vertex, and a set $E_x$ is defined to be a J-set if $\xi J \eta$ and $\eta J \xi$ for every
distinct pair $\xi$, $\eta$ in $E_x$. (Thus all sets with less than two members are J-sets.)

($J_1$) If $\xi J \eta$, $\xi \mid \eta$ has precisely one member.

($J_2$) If $\eta_1 \in \xi_1 \mid \zeta$ and $\eta_2 \in \xi_2 \mid \zeta$, and if $\xi_1 J \xi_2$ or $\xi_1 = \xi_2$, then $\eta_1 J \eta_2$ or $\eta_1 = \eta_2$.
It follows from $J_2$ that if $E$ is a J-set, $E \mid \xi$, and more generally $E \mid \pi$, is a J-set.
From $J_1$ it follows that for no $\xi$ does $\xi J \xi$.

It is now agreed that a set denoted by E, E., etc., shall be finite. (A CD of 
any set is finite by definition.) 

Lemma 2. If the J-set E has k members, all CD’s of E have k cells and the same 
final vertex, and all partial developments are parts of CD’s. 

If $\eta$ and $\xi$ are in $E$, $\eta \mid \xi$ has one member if $\eta \neq \xi$, and none if $\eta = \xi$. Thus
$E \mid \xi$ is a J-set with $k - 1$ members, and the development comes to an end after
$k$ steps.

Let $\eta + \sigma$ and $\zeta + \tau$ be any two CD’s of $E$, ending at $y$ and $z$ respectively;
and let $\eta \neq \zeta$. By $J_1$ and $\Delta_3$ the sets $\eta \mid \zeta$ and $\zeta \mid \eta$ are single cells, $\eta'$ and $\zeta'$,
with a common final vertex, $w$. By $\Delta_4$, $E \mid \eta + \zeta' = E \mid \zeta + \eta'$, a set with
$k - 2$ members. Let $\pi$ be a CD of this set, ending at $u$. By an inductive
hypothesis $\zeta' + \pi$ and $\sigma$, being CD’s of $E \mid \eta$, a J-set with $k - 1$ members, have
the same final vertex: $u = y$. Similarly $u = z$, and so $y = z$.
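
The statement of Lemma 2 can be observed by brute force on a small example. The sketch below enumerates every complete development of a three-member set in a hypothetical model (vertices are sets of labels, a cell removes one label, and every pair of distinct cells at a vertex is taken to be in the relation J). The model and all names are assumptions for illustration.

```python
def step(cells, xi):
    """Derivate of the set `cells` by the cell xi in the toy model:
    xi itself disappears, every other cell moves to the new vertex."""
    vertex, label = xi
    return {(vertex - {label}, b) for (_, b) in cells if b != label}

def complete_developments(cells):
    """Yield every complete development of the finite set `cells` as a list of cells."""
    if not cells:
        yield []
        return
    for xi in sorted(cells, key=lambda c: c[1]):
        for rest in complete_developments(step(cells, xi)):
            yield [xi] + rest

x = frozenset("abcd")
E = {(x, "a"), (x, "b"), (x, "c")}
cds = list(complete_developments(E))
lengths = {len(cd) for cd in cds}                   # number of cells in each CD
ends = {cd[-1][0] - {cd[-1][1]} for cd in cds}      # final vertex of each CD
```

All six complete developments have three cells and end at the vertex `{d}`, in agreement with the lemma for $k = 3$.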

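Lemma 3. If $E^1_x$ is a J-set, and $E_x$ any set at the same vertex $x$, all derivates
of $E_x$ by continuation along CD’s of $E^1_x$ are identical.

With the notation of the previous lemma,

\[
\begin{aligned}
E_x \mid \eta + \sigma &= E_x \mid \eta + \zeta' + \pi, && \text{(inductive hypothesis)},\\
&= E_x \mid \zeta + \eta' + \pi, && (\Delta_4),\\
&= E_x \mid \zeta + \tau, && \text{(inductive hypothesis)}.
\end{aligned}
\]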


9 


If $E^1_x$ is a J-set, $E_x \mid E^1_x$ denotes the continuation of $E_x$ along a CD of $E^1_x$.
By Lemma 3 it is independent of the CD chosen. $E_x \mid E^1_x + E^2_x + \cdots + E^k_x$ is
defined inductively to be $(E_x \mid E^1_x + \cdots + E^{k-1}_x) \mid (E^k_x \mid E^1_x + \cdots + E^{k-1}_x)$.
Thus if $[\xi]_k$ is a CD of $E^1_x$, $E_x \mid E^1_x$ and $E_x \mid [\xi]_k$ are alternative notations for the
same set.

We now come to the main results of the paper. All the conditions A and J 
are purely local, and involve only a fixed number of given cells. 

THEOREM 5. Let $\pi_1$ and $\pi_2$ be paths in a 1-complex with the properties $J_1$, $J_2$ and
$\Delta_1$-$\Delta_4$, descending from a to vertices b and c. Then there exist paths $\pi_3$ and $\pi_4$
descending from b and c to a common vertex d, such that if $E_a$ is a set at a,

\[ E_a \mid \pi_1 + \pi_3 = E_a \mid \pi_2 + \pi_4. \]

We first prove the following special case.
Lemma 5.1. If $E^1_a$ and $E^2_a$ are J-sets, the CD’s of $E^1_a \mid E^2_a$ and $E^2_a \mid E^1_a$ have the
same final vertex, and if $E_a$ is any set at a, $E_a \mid E^1_a + E^2_a = E_a \mid E^2_a + E^1_a$.










(Since all sets called “$E$”, “$E^1$”, etc., in the following proof are at a, the suffix
a will be omitted.)

Case 1. $E^1 \cap E^2 = 0$. Let $n(E)$ denote the number of elements in $E$. We
proceed by induction on $m$, $= n(E^1 \mid E^2) + n(E^2 \mid E^1)$. Excluding the trivial
case where one set $E^i$ is null, the minimum possible value of $m$ is 2. This
minimum is only attained if $n(E^1) = n(E^2) = 1$, and Lemma 5.1 then follows
from $\Delta_3$ and $\Delta_4$. We may therefore assume that $m > 2$, and also that $n(E^1) > 1$
or $n(E^2) > 1$, say $E^1 = E^* \cup \eta$, where $E^* \neq 0$.

The proof depends on the fact that if £ is not in EZ, n(E | £) = n(E); and hence 
if E? and E“ are J-sets satisfying E? M E* = 0, and E” C€ E’, then n(E” | E”) < 
n(E” | E*). Thus n(E’ | E*) S n(E*| E’) and n(E*| BE’) < n(E'| BE’). There- 
fore, by the inductive hypothesis, CD’s of E”| E* and E*| E” have the same 
final vertex, z. Since E’ is a J-set, 7 | EZ’ is a single cell, £, and 


(h(E \|E)=,7|F +E =n\|E°+E (by the inductive hypothesis) 
= E: | E’ + E*, 

since E*| E’ + E* = E*| E* + E? = 0. Thus there is a CD of E’ | E’ consist- 
ing of a CD of E’ | E” followed by a CD of £| (E’| E’). Since E’ | E’ is not 
null it follows that n(é | (E” | E*)) < n(E’ | E’), and since also (E’ | EB’) |& = 
E’ | E' the inductive hypothesis may be applied to the sets £ and E” | E’ at z. 
The final vertices of CD’s of (E’ | E’) | &, ie. E’| E’, and of &| (E’ | E’) are 
therefore identical, and the latter set has been seen to be the end portion of a 


CD of EZ’ | E’. The first part of the induction is therefore complete. If E is 
any set at a, 


E|\PF +B =E\FP+E +72, 
= E\|E°+ E+, (by the inductive hypothesis applied to E’ and E’) 
= E\E*+7+£’, (by the inductive hypothesis applied to ¢ and E” | E’) 
= E\E' + BE’. 


Case 2. As Case 1 save that E'NE’? #0. Let E' = E UE‘, fori = 1,2, 
where E°M E* = 0. By Case 1, applied to E*| E° and E*| E’, E*| E° + E* 
and E* | E° + E* have the same final vertex, and since E° | E° = 0 these two 
sets are E | HE? + E* and E'| E° + E*, ie. E’| E' and E' | E’. 

In the general case, to which we now turn, the result may be stated more 
explicitly as follows, taking 71 and m2 to be [n]; and [{]. . 

Lemma 5.2. If [n]; and [f], are any paths descending from a, to b and c there 
8 paths o,, and 7,, , (possibly null, r = 1,---,j+1,s =1,--:,k +1) such 
that 

(1) Ns = O11, fr = Tri 

(2) or41,2 and 7;,241 have the same final vertex, 

’ This proof of Case 1 was suggested by Dr. J. H. C. Whitehead, in place of one based on 
Theorem 4. 







(3) or, is a CD of Er., = ne | tis + 22 +++ + Tr12, and ty of Et, , = 
f | on + +e + Or,s—1 + 

(4) for any Ea, Ea|m + ties tee + ties = EBa| me + ofa tee + 
Oj+1,k + 

Starting from the two given paths [y]; and [¢], we add, one by one, the pairs 
of paths o,4:,, and 7;,.4: for the couples (r, s) in the standard “triangular” 
order (1, 1), (1, 2), (2, 1), (1, 3),--:. When the time comes for o,4;,, and 
7,241 to be added, the paths o,, and 7,,;, descending from a vertex 2,,, and 
corresponding to the earlier couples (r,s — 1) and (r — 1, s), have already been 
constructed as CD’s of E}, and E*,. Hence by the cases of Theorem 5 already 
settled, CD’s E}, | 7-s and E*, | ors, i.e. of E} 4, and E*4;,., meet at a common 
vertex. These CD’s are 7,,.41 and o,41,.; the induction is complete. (In the 
limiting cases r = 1 and s = 1 the single cells 7, and ¢, play the parts of E; and 
E? in the earlier cases.) 

The proof just given provides a method of deforming 7 + 73 into m:. + 7, 
by a series of steps in each of which a path 7,, + o,41,s is replaced by a path 
Ore + Trs41- By Lemma 5.1 a set E, at the common initial vertex of g,, 
and 7,;, When continued along either of these paths gives the same result, and 
therefore the continuation of Z, along the whole path is unaffected by a single 
step. 

THEOREM 6. Any two CD’s of a (finite) set E, have the same final vertex. 

If [n]; and [¢], , ending in b and c¢, are the developments then, with the nota- 
tions of Lemma 5.2, since 7 € Ez | [n]sa, 


x. om E, | [n]s—1 + T1s + a i cies + Tr—1,8 5 
= E, | (¢]-—1 + Ga t+ eee + Or,s—1 


and therefore o,1; + 02 +--+ is a development of £, | [f],1 ; and in particular 
Or41,1 + Or412 +++ is a development of EF, | [f]., = 0, since [f]}, is a CD. 
Thus c = d, and similarly b = d. 

COROLLARY 6.1. Continuation of a set $E_x$ along any two CD’s of a set $E^1_x$ gives
the same result. This now follows from Theorem 5, $\pi_3$ and $\pi_4$ being null.

Corollary 6.1 cannot be extended to give the general monodromy property,
“continuation of $E_a$ along any two descending paths from a to b gives the same
result.” Consider the 1-complex in Fig. 5, in which the vertices marked x are
identical. Derivates are defined by parallel displacement downward, except
that the derivates of xy and xz at z and y are zw and yw respectively. All sets
are J-sets. The conditions $\Delta$ and J are satisfied in this complex, but
continuation of ab to x via b gives the null set, via c the cell xz.

THEOREM 7. In a complex satisfying J and $\Delta$, all developments of a finite set
$E_x$ are finite.

Every set is a sum of J-sets, namely its individual members. We proceed by
induction on the smallest number, $k$, of J-sets, $E^i_x$, whose sum is the given set
$E_x$. (The case $k = 1$ is Lemma 2.)










There is at least one CD of E,, namely [co], , where o, is a CD of the J-set 
E;| [c],1. Suppose that (1 + f + --- is an infinite development of E,, and 
lel Ges s Tee E', and E?, be as in Lemma 5.1, save that o, replaces 7,. Then 
just as in Theorem 6, 71s + 72 + --- is a development of E, | [oc]... Since, 
for i < k, Ei is annihilated by continuation along o;, E, | [oka = E* | [oh . 








Fig. 5


Thus ry, + 7% +--+ is a development of a J-set, and so t,x = 0 if r exceeds a 
certain g. Therefore ifr > q, 


0 = Ey =¢|on +°:: + Or,r-1- 


Now on +--+ + o-a-1 is a development of H,, = (E;UE,U---U Es”) 
(1, and hence by Lemma 1, {,¢H,. Therefore the infinite path fou + 
So42 + --+ is a development of H,4:1, a sum of (k — 1) J-sets,—contrary to the 
inductive hypothesis. 
COROLLARY 7.1. There are only a finite number of different developments of $E_x$.
If there are an infinity, some cell $\xi_1$ of the finite set $E_x$ must come first in an
infinity of developments; and some cell $\xi_2$ of the finite set $E_x \mid \xi_1$ must be second
in an infinity of these developments; and so on. The path $\xi_1 + \xi_2 + \cdots$ is
an infinite development of $E_x$, contrary to Theorem 7.


10 


Theorem 7 is connected with the problem of “random reduction”. To a
“normal form” or “end-form” in a system with moves there corresponds an
end of $\Sigma$, and to a normal form of X an end connected by a path to a given
vertex x. It follows from Theorem 5 that there is a descending path from x
to the end, and that a vertex cannot be connected to two ends, i.e. that an
“object” in the corresponding system cannot have two different normal forms.
There remains, however, the possibility of an infinite descending path from a
vertex which is also connected to an end. It will now be shown that this
possibility is not realised in complexes satisfying the conditions $\Delta$ and J.
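
The two questions raised here, whether an object has a unique normal form and whether some reduction from it can nevertheless go on for ever, can be illustrated on a finite system of moves. The following Python sketch is a hypothetical illustration: the graphs `moves_a` and `moves_b` are invented examples, and nothing is assumed about whether they satisfy the conditions on derivates.

```python
def normal_forms(moves, start):
    """Ends (vertices with no outgoing move) reachable from `start`.
    `moves` maps each vertex to the list of vertices one move below it."""
    seen, stack, ends = set(), [start], set()
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        successors = moves.get(v, [])
        if not successors:
            ends.add(v)
        stack.extend(successors)
    return ends

def has_infinite_descending_path(moves, start):
    """In a finite system an infinite descending path exists
    exactly when a cycle is reachable from `start`."""
    state = {}  # vertex -> "active" while on the search path, "done" afterwards

    def visit(v):
        state[v] = "active"
        for w in moves.get(v, []):
            if state.get(w) == "active" or (w not in state and visit(w)):
                return True
        state[v] = "done"
        return False

    return visit(start)

# Every maximal path from x ends at e, and all paths are finite.
moves_a = {"x": ["y", "z"], "y": ["e"], "z": ["e"], "e": []}
# x is connected to the end e, yet x -> x2 -> x -> ... never terminates.
moves_b = {"x": ["x2", "e"], "x2": ["x"], "e": []}
```

In `moves_b` the vertex x has the single normal form e but also admits an infinite descending path; this is the situation that Theorem 8 excludes under the conditions $\Delta$ and J.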

THEOREM 8. If, in a complex satisfying the conditions $\Delta$ and J, there is a path
descending from x to an end e of $\Sigma$, all descending paths from x are finite, and
all maximal paths end at e.

That all maximal descending paths from x end at e is obvious in view of
Theorem 5; only the finiteness remains to be proved.

Let $[\eta]_m$ be a descending path from x to e, and (if possible) $\zeta_1 + \zeta_2 + \cdots$
an infinite descending path from x. Let the paths $\sigma_{rs}$ and $\tau_{rs}$, and the sets
$E^1_{rs}$ and $E^2_{rs}$, be constructed as in Lemma 5.2. Since e is an end all the $\tau_{r,m+1}$
are null. Let $j$ be the largest number such that $\tau_{rj}$ is non-null for an infinity
of values of $r$, and $k$ a number such that $\tau_{r,j+1} = 0$ if $r \ge k$. Then $E^2_{r,j+1}$, of
which $\tau_{r,j+1}$ is a CD, is also null, giving $E^2_{rj} \mid \sigma_{rj} = E^2_{r,j+1} = 0$. Since $\sigma_{rj}$ is a
CD of $E^1_{rj}$ it follows (Lemma 1) that, for $r \ge k$,

\[ E^2_{rj} \subseteq E^1_{rj} = E^1_{kj} \mid \tau_{kj} + \cdots + \tau_{r-1,j}. \]

Thus $\tau_{kj} + \tau_{k+1,j} + \cdots$ is a development of $E^1_{kj}$, and by Theorem 7 cannot
be infinite, contrary to the definition of $j$.

It follows that if a 2-complex $\Sigma^2$ is constructed as in §5, all its components
containing ends of $\Sigma$ are simply connected.


11 


The theorems that have been proved indicate that complications will arise
when the descending paths that join the “y” and “z” of condition (D) to “w”
have either more or less than one member each, and that the difficulties are of
a different kind in the two cases. In the foregoing group of theorems the second
possibility, (corresponding to $\xi \mid \eta = 0$ for $\xi \neq \eta$), was excluded. The following
theorem allows this possibility, but is in other ways more special than
Theorem 5, and the meaning of the conditions imposed is less obvious. The
theorem is used in extending the Church-Rosser Theorem to an enlarged calculus.

We suppose that derivates are defined in $\Sigma$, and satisfy $\Delta_2$-$\Delta_4$, but that $\Delta_1$
holds only in the weakened form

($\Delta_1^*$) $\xi \mid \xi = 0$, and if $\xi \mid \eta = 0$ then $\eta A \xi$,







where $\eta A \xi$ stands for “either $\eta = \xi$ or $\eta \mid \xi$ has just one member”. The following
additional “J-condition” is imposed, $\bar{A}$ denoting “not $A$”:

(J;) if £An and Jf, then nJ¢. 
(Condition Js does not imply the second half of At, since éJé.) 

Lemmas 2 and 3 remain true under these conditions, and are proved as before.
The notation $E_x \mid E^1_x + \cdots + E^k_x$ may therefore be introduced for J-sets $E^i_x$.

THEOREM 9. A complex with the properties $\Delta_1^*$, $\Delta_2$-$\Delta_4$ and $J_1$-$J_3$ has the
property (A).

It is sufficient to prove the following special case, since the extension to the
general case then proceeds exactly as in Theorem 5.

Lemma 9.1. If $E^1_a$ and $E^2_a$ are J-sets, and $E_a$ is any set at a, the CD’s of
$E^1_a \mid E^2_a$ and $E^2_a \mid E^1_a$ have the same final vertex, and $E_a \mid E^1_a + E^2_a = E_a \mid E^2_a + E^1_a$.

Let [nls and [¢], be CD’s of E} and E? , n; and ¢; having final vertices b; and 
¢;. “ELJE:” means “nJ¢ if ne E, and ¢¢E?.” From Jz it follows that if 
E,JE@ , (Ea | §)J (Es | €). 

Case 1: ELJE?. We show further that in this case E), | Ez has j cells. First 
letj7 = 1. If also k = 1 the result follows immediately from J; and A;. For 





Fig. 6


general k, we have (m | £:)J(E2 | ¢1); and since m | {: is a single cell, and Ez | & 
has k — 1 cells, an inductive hypothesis shows that m | E? is a single cell and 
has the same final vertex as a CD, x’, of (E2|¢1) | (m | 1), which by Ay isE? | m + 
&:. Hence xr’ isa CD of E? |m. That E, | E, +E? = £,| E? + E> is proved, 
as in Theorem 5, by repeated applications of Ay. Case 1 for general 7 is now 
completed bd applying the case 7 = 1 successively to 7, and E? | [n],1, for 
r=1,2,--- ,j, and using the last part of the result for r — 1. 

Cask 2: Eh | £2 = 0. We show further that, in this case, E2| Ei has k 
members or less. If 7 = k = 1 the result is clear from J and A. Suppose that 
j = 1, k > 1. Then o%Am ; for if t%Am, by Js and Jo (m | oi) J(E; | 1), and 
hence ‘ Case 1 m | EZ ¥ 0, contrary to the hypothesis. Thus {1 | mis a single 
cell. By a k-induction applied to m | ¢: and E| 1, (in place of EZ, and E>), 
all CD’s of E? | ¢; + m have k — 1 cells or less, and end at c,. Since this set 
is also E; | m + {1, the CD of E?| m is the cell £1 | m, followed by the k — 1 
F (or less), of E27 | {i+ m. The final part follows by repeated applications 
of Ay. 


The extension to general j is as in Case 1. 




GENERAL Case. In view of Lemma 3 it may be assumed that if E} | BE? = 0, 
[n]; is chosen so that m| Ei # 0. A series of “zig-zag” paths, m, mm, --- , 
from b; to c , is constructed, m being —[n]; + [¢]).. Suppose 7, already con- 
structed, 


We = 7] — 1+ 1 — °°* + m1 — Om, 


where o; and 7; are CD’s of subsets, E; and E;, of Ei |i and E?| v; respec- 
tively. The y; are descending paths from a satisfying 

(i) Ym is [Sli 

(ii) for any E, at a, Ea|yi + 0: = Bal via + ti-a- 
This whole inductive hypothesis is satisfied by m if m = 2, 7 = o2 = y1 = 0, 
a1 = yo = [n]i, 1 = v2 = [She 

Let 7, have a peak at the join of om—1 and tm-1, and let u be the final vertex 
of tm-1. First suppose that E},1| tm-1=0. By Case 2a CD, 8, of Bis | om-1 
ends at u, and we define 7,4; to be to — o1 + +--+ + Tm-2 + 8 — om, the new 
m’, Tm~2 and on-1 being m — 1, tm-2 + Oandom. The ¥; Up to Ym—2 are the 
corresponding y;, and Yn-1 = Ym. Since @ is a CD of 


En-1 | om & Es lY¥a-1 + Omi 
= Ei | ym-2 + Tm2; 
tm-2 + 0 is a CD of a subset of Ei |ym-2. For any E, ata, 
Ea|Y¥m—1 + om = Eu|¥m + om 
= E,| yma + Tm-1 
= Ba|Y¥m—-1 + om-1 + 0 (Case 2) 
= Ea|ym-2 + Tm-2- 


Secondly let Ens | tm-1 ~ 0. By Lemmas 2 and 3, if a CD of En-1, whose 
first cell, £, satisfies E” | rm_1 ¥ 0, is substituted for om—1, all the conditions 
imposed on 7, remain satisfied, and it may therefore be assumed that o»_1 itself 
is sucha CD. Let tm_1 be [w],, (p ¥ 0 in view of the peak). We construct 
successively, as in Theorem 5, for 7 = 1, 2,--- , p, pairs of descending paths 
tm24i and &°*? + g),_1,;, which are CD’s of a; |é and & | w; respectively, 
and, by A;, have a common final vertex. The notation “ET? + om—14:” 
implies that £“ |; ¥ 0, which is justified by &° | w:-1 ¥ 0, derived ultimately 
from —™ | tm-1 #0. If oma is €? + ona , T4118 defined to be 


/ / , / 1 
to — Ort ee+ + time — Oma + Tmt — *°* + tm24p — Sm—izp — EP? — om. 


Its final ascending part has at least one more cell than c,,. If y; is taken to be 
Yh for h up tom — 2,¥m1 + [wlremar + EO” forh =m —1tom+ p — 2, 
and Ym4p-1 = Ym , all the conditions are fulfilled, the new “m” being m + p — 1, 














the new “om” om + E?*? + om+p-1- Condition (ii) follows for the y; im- 
mediately from A, except for 7 = m + p — 1; and 


(p+1) , 
E, | Ym + om + g? + Om+p-1 = E, | Ym—-1 + Tan + "adie + Getons 


/ 


= Ee|ym-1 + [wha + & + ta¢p-2 (As) 


/ / 
= E, | Y¥m-+-p—2 + Tn+p-2 


as required. 

It has thus been shown how, in all cases where 7, has a peak, a path 7,4, is 
to be constructed having either one less peak or a longer final ascending portion. 
Since this portion is a development of the finite J-set E, | [¢]. , the second alterna- 
tive can only occur a finite number of times. Thereafter the number of peaks 
decreases at each step until a path with no peaks is reached. 

The extension to paths which are not CD’s of J-sets now follows as in 
Theorem 5. 


12 


The theorems that follow Theorem 5, as far as Corollary 6.1, are proved
under the new conditions with only minor changes in the argument. Theorem
8 fails to survive, as may be seen by considering Fig. 1: all the conditions $\Delta_1^*$,
$\Delta_2$-$\Delta_4$, and $J_1$-$J_3$ are satisfied by taking the derivate of $a_r b$ at $a_{r+1}$ to be $a_{r+1} b$,
and that of $a_r a_{r+1}$ at b to be null; and $a_r b \, J \, a_r a_{r+1}$ but not $a_r a_{r+1} \, J \, a_r b$.

Theorem 7 still holds but its proof needs some modification.

THEOREM 10. In a complex satisfying $\Delta_1^*$, $\Delta_2$-$\Delta_4$ and $J_1$-$J_3$ all developments of
finite sets are finite.

We proceed as in the proof of Theorem 7. As before it follows that 7, + 
7 + +++ is a development of EF, | [c],-1 and that for some p and gq the develop- 
ment Tpit + Tp42q + ++ Of Ee | [oloa + tig + +++ + Tpq, = Ez | Opq say, is 
infinite, while 7,,41 = Oif r > p. Since the o’s and 7’s are CD’s of J-sets it 
follows from Case 2 of Lemma 9.1 that the number of cells in a,, is (for r > p) 
non-increasing with increasing r. If ¢, belongs to E% | [f],-1, trq is contained in 
Ez | [fa + on + +++ + oq = E%| 6,-1.9 ; by the last part of Lemma 9.1 the 
path ¢,, is a CD of this set, and may therefore be chosen in the form 7,, + org : 
and o,41,q to be org. Thus o;41,¢ has less cells than ,, unless 7,, = 0. If, for 
T > 10, Trq = O whenever ¢, € H3 | [£]-1, 7ro41,.¢ + °°: is an infinite development 
of the sum of the k — 1 J-sets Ei | 6,9 (¢ ¥ q), contrary to the inductive hy- 
pothesis. Hence the number of cells in ¢,, eventually diminishes to zero, and 
from this point on the 7,, coincide with the 7,,.41 and are null,—contrary to the 
initial hypothesis. 

COROLLARY 10.1. Under the same conditions, the number of different
developments of $E_x$ is finite. (Compare Corollary 7.1).




13. Application to the conversion calculus 


The formalism first considered is that of Theorems 1 and 2 of Church and
Rosser [1], but modified in two ways, first by the adoption of the simpler
bracketing of Church [1], secondly by the entire exclusion of “singular” formulae,
i.e. those having “accidental” coincidences between the bound variables.⁸ The
WFF’s are therefore rows of the symbols λ, variables, and round brackets, built
up according to the following rules: (1) x is a WFF, (2) if M is a WFF
containing x as a free variable, (λxM) is a WFF, (3) if A and B are WFF’s whose
common variables are free in both, (AB) is a WFF. (A variable is bound in
any row of symbols in which one of its occurrences immediately succeeds a λ,
otherwise free.)
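
The three formation rules translate directly into a recursive check. The sketch below is one possible encoding in Python, not part of the original formalism: a variable is a string, (λxM) is the tuple `("lam", x, M)` and (AB) is `("app", A, B)`; these representations and the function names are assumptions.

```python
def occurring(term):
    """All variables occurring anywhere in the row of symbols."""
    if isinstance(term, str):
        return {term}
    if term[0] == "lam":
        return {term[1]} | occurring(term[2])
    return occurring(term[1]) | occurring(term[2])

def bound(term):
    """Variables with an occurrence immediately after a lambda."""
    if isinstance(term, str):
        return set()
    if term[0] == "lam":
        return {term[1]} | bound(term[2])
    return bound(term[1]) | bound(term[2])

def free(term):
    """A variable is free in a row of symbols when it occurs but is not bound in it."""
    return occurring(term) - bound(term)

def is_wff(term):
    """Rules (1)-(3) for non-singular well-formed formulae."""
    if isinstance(term, str):                      # rule (1)
        return True
    if term[0] == "lam":                           # rule (2): x must be free in M
        _, x, body = term
        return is_wff(body) and x in free(body)
    _, a, b = term                                 # rule (3): common variables free in both
    common = occurring(a) & occurring(b)
    return is_wff(a) and is_wff(b) and common <= (free(a) & free(b))

IDENTITY = ("lam", "x", "x")
CONSTANT = ("lam", "x", "y")                           # x is not free in the body
SINGULAR = ("app", IDENTITY, ("lam", "x", "x"))        # bound variable x shared
REGULAR = ("app", IDENTITY, ("lam", "y", "y"))
```

Under this encoding `CONSTANT` is rejected by rule (2) and `SINGULAR` by rule (3), because the two parts share the bound variable x.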

The allowed transformations that concern us are 

I. To replace each specimen of a bound variable x in X by y, a letter not
occurring in X.

The result, Y, of any series of applications of I to X will be called an adjusted
copy of X, and X conv.-I Y.

II. To replace a part ((λxM)N) of X by the result of substituting adjusted
copies of N for the specimens of x in M, the new bound variables being all
different from each other and those of X.
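
Rule II can be sketched in Python as substitution in which every inserted copy of N has its bound variables renamed to fresh letters. The tuple encoding (`("lam", x, M)`, `("app", A, B)`, variables as strings), the naming scheme `v0, v1, ...` for fresh letters and all function names are assumptions made for illustration. The sketch relies on the redex being a non-singular WFF, so that no free variable of N can be captured in M.

```python
import itertools

def bound(term):
    """Variables with an occurrence immediately after a lambda."""
    if isinstance(term, str):
        return set()
    if term[0] == "lam":
        return {term[1]} | bound(term[2])
    return bound(term[1]) | bound(term[2])

def rename(term, mapping):
    """Replace every specimen of the mapped variables (Rule I, applied repeatedly)."""
    if isinstance(term, str):
        return mapping.get(term, term)
    if term[0] == "lam":
        return ("lam", mapping.get(term[1], term[1]), rename(term[2], mapping))
    return ("app", rename(term[1], mapping), rename(term[2], mapping))

def adjusted_copy(term, used):
    """An adjusted copy whose bound variables are letters not in `used`.
    The chosen letters are added to `used` so that later copies differ."""
    mapping = {}
    for v in sorted(bound(term)):
        new = next(f"v{i}" for i in itertools.count() if f"v{i}" not in used)
        used.add(new)
        mapping[v] = new
    return rename(term, mapping)

def substitute(body, x, n, used):
    """Put an adjusted copy of n for each specimen of x in body."""
    if isinstance(body, str):
        return adjusted_copy(n, used) if body == x else body
    if body[0] == "lam":
        return ("lam", body[1], substitute(body[2], x, n, used))
    return ("app", substitute(body[1], x, n, used), substitute(body[2], x, n, used))

def contract(redex, used):
    """Rule II applied to a part of the form ((lam x M) N)."""
    (_, (_, x, body), n) = redex
    return substitute(body, x, n, used)

# ((λx(xx))(λy y)) with the letters of X recorded in `used`.
redex = ("app", ("lam", "x", ("app", "x", "x")), ("lam", "y", "y"))
result = contract(redex, used={"x", "y"})
```

The two specimens of x receive two copies of (λy y) with different bound variables, so the result is again non-singular.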

It is agreed that a WFF denoted by one of the letters U, V, is of the form
((λxM)N), and we accordingly speak of “the move U on X,” or “the move
(X, U),” if U is a part⁹ of X, meaning the application of Rule II in which U is
the part operated on. If Y is the WFF that thus replaces X, we write (X, U) → Y
and X conv.-II Y. If a series of moves I that turns X into Y turns its part U
into V, (Y, V) is an adjusted copy of (X, U).

To define the residuals of a part V of X after the move ((λxM)N), suppose
that each pair of brackets in M is provided with a numerical suffix, which is
left unchanged in applying rule II, and that V is enclosed in the pair $_1(\ )$. If
the move ((λxM)N) turns X to Y, then

(a) if V = ((λxM)N), V has no residual in Y;

(b) if V is a part of N its residuals are the corresponding parts of the adjusted
copies of N that replace x in M;

(c) in all other cases the residual of V is the part $_1(\ )$ of Y.

The complex $\Sigma$ to which our general theorems will be applied has as a typical
vertex the class [X] of all adjusted copies of a WFF X. A positive 1-cell is the
class [(X, U)], or briefly [X, U], consisting of (X, U) and all its adjusted copies;
and its initial and final vertices are [X] and [Y], where (X, U) → Y. If V is also
a part of X, the [X, U]-derivate of [X, V] consists of all the cells $[Y, V_i]$, where
the $V_i$ are the residuals of V in Y. Finally “[X, U]J[X, V]” means that (i) neither





⁸ Cf. Newman [2] §3. After the general theoretical work the calculus may be extended,
for practical convenience, by re-admitting the singular formulae and resuming the original
rules I and II; and it can be shewn without difficulty that (1) every singular WFF X conv.-I a
non-singular X′, and (2) if X conv.-II Y, X conv.-I X′, Y conv.-I Y′ in the extended calculus,
X′ and Y′ being non-singular, then X′ conv.-I-II Y′ in the restricted calculus.

⁹ Defined as in Church [1].










U nor V is a part of the other, and (ii) the free variables of U and V are the
same. Thus J is a symmetrical relation, and is independent of the WFF chosen
to represent [X].

With these definitions the conditions $J_1$, $J_2$ and $\Delta_1$-$\Delta_4$ are satisfied. ($J_1$): no
comment is necessary. ($J_2$): let $\eta_1$ and $\eta_2$ be determined by the parts $V_1$ and $V_2$
of X, and $\xi$ by U = ((λxM)N). If $V_1 = V_2$, distinct members of $\eta_1 \mid \xi$ are
determined by different adjusted copies of $V_1$, and evidently satisfy (i) and (ii).
If $V_1$ and $V_2$ are not identical they are mutually exterior, and a residual of $V_1$
could only be a part of a residual of $V_2$ if $V_1$ were a part of N, and $V_2$ a part
of M containing x. This contravenes the condition (ii) for $V_1$ and $V_2$ since x
is free in M and cannot occur in N. A part of X and its residuals in Y have the
same free variables, except that x is replaced at all occurrences by adjusted
copies of N. Hence if $V_1$ and $V_2$ have the same free variables their residuals
in Y have also.

In considering the conditions $\Delta$, let $\xi$, $\eta$, $\zeta$ be determined by the moves $U_i$
on X, ($i$ = 1, 2, 3), where $U_i = {}_i((\lambda x_i M_i)N_i)$, and $(X, U_i) \to Y_i$. Thus $U_i$ is
the part ${}_i(\ )$ of X.

$\Delta_1$: no comment is necessary. $\Delta_2$: suppose that $U_2 \neq U_3$. The residuals
are clearly distinct if they are determined by the original brackets, or one by
old and the other by new brackets. The remaining possibility is that a residual
of $U_2$ is a part, $U_2'$, of an adjusted copy of $N_1$, and a residual of $U_3$ is either a
different part of the same copy, or part of a different copy, in any case different
from $U_2'$. $\Delta_3$: the condition is obviously satisfied unless one of $U_2$, $U_3$ is part
of the other, say $U_2$ of $U_3$. If $U_2$ is in $M_3$ the residual of $U_2$ in $Y_3$, and of
$U_3$ in $Y_2$, are determined by their original brackets, and since $N_3$ contains no
copy of $x_2$ (a bound variable of $M_3$), the order of performance of ${}_2(\ )$ and ${}_3(\ )$
is indifferent. If $U_2$ is in $N_3$ the final effect is the same whether ${}_2(\ )$ is performed
first on $N_3$, followed by ${}_3(\ )$, or the residuals of ${}_2(\ )$ on the adjusted copies of
$N_3$ in $Y_3$. $\Delta_4$: Let W be the final result of either series of moves on X: it has
been shown to be unique to within I-adjustment, and therefore determines a
unique vertex, $\omega$, of $\Sigma$. If $U_1$ (or $U_2$) is not part of either of the other $U_i$'s,
its performance, before or after $U_2$ (or $U_1$), does not affect the residual of $U_3$.
We may therefore suppose that one of the $U_i$'s contains the other two. If $U_3$
is not part of either $N_1$ or $N_2$ the residual of $U_3$ in W by either route is ${}_3(\ )$.
We therefore assume that $U_1$ contains both $U_2$ and $U_3$, and that $U_3$ is part of
either $N_1$ or $N_2$. Finally, if both $U_2$ and $U_3$ are in $N_1$, the same residuals of $U_3$
are evidently obtained whether $U_2$ is performed on $N_1$ before $U_1$, or the
corresponding moves on the copies of $N_1$ after $U_1$. There remains only the case
where $U_2$ is part of $M_1$, and $U_3$ of either, ($\alpha$), $N_1$, or, ($\beta$), $N_2$. ($\alpha$): the residuals
of $U_3$ by either route are the corresponding parts of the adjusted copies of $N_1$
that replace $x_1$ in the move ${}_1(\ )$ on $Y_2$. ($\beta$): if the residuals of $U_3$ in $Y_2$ are
the parts enclosed in the brackets ${}_{3_1}(\ )$, ${}_{3_2}(\ )$, $\ldots$, the residuals in W by either
route are the parts enclosed in the same brackets.

The conditions for all the Theorems 5 to 8 are therefore satisfied, and we 
obtain the following results. 










242 M. H. A. NEWMAN 


COROLLARY 11.1. If X conv. I-II Y and X conv. I-II Z, there is a WFF W such
that Y conv. I-II W and Z conv. I-II W.

COROLLARY 11.2. There are only a finite number of different developments of a
given set of moves II on a WFF X. All of them are finite, and all end in adjusted
copies of the same WFF.

COROLLARY 11.3. A WFF has (apart from I-adjustments) at most one normal
form, and if one exists all series of moves II terminate in this normal form or an
adjusted copy.


14 


Two generalizations of these theorems were given by Church and Rosser in
their paper. The first, to the formalism extended so as to include the $\delta$-symbol,
is of no interest in the present connection: it is easily shown that the original
conditions $\Delta$ and J are still satisfied, and hence that the Corollaries 11 hold.
The second generalization (of which the proof was not given by Church and
Rosser) is to the formalism in which $(\lambda x M)$ is counted a WFF even if x does
not occur in M. The rules of procedure, and the definitions of derivates need
no modification, and the conditions $\Delta_2$ to $\Delta_4$ are proved to hold, just as before.
The second part ("only if") of condition $\Delta_1$ now fails, but the condition $\Delta_1^*$ is
satisfied. The conditions $J_1$ to $J_3$ are also satisfied if a different, more complicated,
interpretation is given to J.

Let "U S V" stand for "a free variable of U is bound in V." It implies that
U is a proper part of V, and if U' and V' are, for any W, W-residuals of U and V,
U' S V' implies U S V. Let "U Ex V" stand for "neither U nor V is a part of
the other." Then, with the same notation, U' Ex V' implies U Ex V. We now
take [X, U] J [X, V], for any significant U and V, to mean

"(i) U is not a part of V, and (ii) there is no part W of X such that V S W
and U Ex W."

$J_1$ is clearly satisfied.

$J_2$. Let the notations be those of the previous discussion of $J_2$, and let $V_1'$
and $V_2'$ be distinct residuals of $V_1$ and $V_2$. If $V_1'$ S W' and $V_2'$ Ex W', then
$V_1$ S W and $V_2$ Ex W, which is incompatible with $\eta_1 = \eta_2$ or $\eta_1$ J $\eta_2$. If $V_1 = V_2$,
$V_1'$ cannot be part of $V_2'$. If $V_1 \neq V_2$, the only possibility that $V_1'$ be part of $V_2'$
is that $V_2$ be part of M, with x as a free variable, and $V_1$ be in N, which in view
of $\eta_1$ J $\eta_2$ contravenes (ii).

Js. Let 7, and ¢ be determined by U, Vi, and V2, where U is ((AxM)N). 
Suppose that A n and 7J¢. Then Vj is part of N and either U is part of V2 or, 
for some W, V2 SW and U Ex W. The first alternative gives V; part of V2 ; the 
second V2 S W and V,; Ex W; and both contradict ¢ J ¢. Hence 

THEOREM 12. Corollaries 11.1, 11.2 and the first part of Corollary 11.3 hold 
in the extended calculus. 

It is easily seen that the second part of Corollary 11.3 fails to survive. 
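
A standard illustration, not taken from the paper and written in the $(\lambda x M)N$ notation used above, may help here. It assumes only the extended calculus, in which $(\lambda x\, y)$ counts as a WFF although x does not occur in y; the names X, N and z are chosen for the example.

```latex
\begin{align*}
X &= \bigl((\lambda x\, y)\,N\bigr), \qquad
N = \bigl((\lambda z\,(zz))(\lambda z\,(zz))\bigr),\\
X &\longrightarrow y
  && \text{(move on the whole of } X\text{; the result is a normal form)},\\
X &\longrightarrow \bigl((\lambda x\, y)\,N\bigr) = X
  && \text{(move on the part } N\text{, which reduces to itself)}.
\end{align*}
```

So X has the normal form y, yet the series of moves that always works inside N never terminates, which is the sense in which the second part of Corollary 11.3 is lost.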


CAMBRIDGE, ENGLAND 













“COMBINATORIAL THEORY” AND “EQUIVALENCE” 243 




REFERENCES 


J. W. ALEXANDER, [1] Combinatorial theory of complexes, Annals of Math., 31 (1930), pp.
292-320.

G. BIRKHOFF, [1] Lattice theory, New York, 1940.

A. CHURCH, [1] A formulation of the simple theory of types, J. of Symbolic Logic, 5 (1940),
pp. 56-68.

A. CHURCH AND B. ROSSER, [1] Some properties of conversion, Trans. Amer. Math. Soc., 39
(1936), pp. 472-482.

M. H. A. NEWMAN, [1] A theorem in combinatory topology, J. London Math. Soc., 6 (1931),
pp. 186-192.

[2] Stratified systems of logic. (Forthcoming.)










ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942


ISOMORPHISMS OF NORMED LINEAR SPACES¹


By GEORGE W. MACKEY
(Received August 28, 1941) 


Introduction 


Following Banach [1, p. 180]² we say that two normed linear spaces $X_1$ and
$X_2$ are isomorphic if there exists a one-to-one correspondence between their
elements which is both a homeomorphism and an algebraic isomorphism; that is,
if the spaces are abstractly identical as topological linear spaces. Let $R_i$
($i = 1, 2$) be the ring of all continuous linear³ transformations of $X_i$ into itself.
Eidelheit [2] has shown that if $X_1$ and $X_2$ are complete, then $X_1$ and $X_2$ are
isomorphic if and only if $R_1$ and $R_2$ are isomorphic as rings. In this paper we
prove two analogous theorems; one involving the lattices of closed linear
subspaces of $X_1$ and $X_2$ and the other their groups of automorphisms (self
isomorphisms). The latter theorem differs a little from the others in that from the
isomorphism of the groups of automorphisms it is not concluded that $X_1$ and
$X_2$ are isomorphic but only that either this is the case or $X_1$ and $X_2$ are what we
shall call pseudo-reflexive and mutually pseudo-conjugate. In neither theorem
do we need to assume anything about completeness, and we use our methods
to prove Eidelheit's theorem without this restriction.

In all three theorems the proof of the necessity is trivial and that of the
sufficiency involves three main steps. First we use the given isomorphism between
the associated algebraic systems to set up a one-to-one linear independence
preserving correspondence between the one dimensional subspaces of $X_1$ and $X_2$.
Next we show that this correspondence may be defined by a one-to-one linear
transformation of all of $X_1$ into all of $X_2$. Finally we prove that the
transformation is a homeomorphism. The second and third steps are accomplished
in the same way in all three cases. We devote the first section to proving the
two fundamental lemmas involved. The first step is accomplished by
associating one dimensional subspaces with elements of the algebraic systems in a
natural way and showing that the correspondence between one dimensional
subspaces set up via the given isomorphism between the algebraic systems has
the properties desired. This is carried out by giving algebraic characterizations
of certain kinds of elements and sets of elements in the algebraic systems. The
difficulty of doing this increases rapidly as we pass from lattices through rings
to groups. Accordingly we prove the three theorems in that order in sections
II, III, and IV.





1 Presented to the American Mathematical Society under another title, November 22, 
1941. 

2 The numbers in brackets refer to the bibliography.

3 In this paper linear means additive and homogeneous. 




I. Two Fundamental Lemmas 


LEMMA A. If $X_1$ and $X_2$ are linear spaces having dimension greater than two
and if $A \leftrightarrow A'$, where $A \subset X_1$ and $A' \subset X_2$, represents a one-to-one correspondence
between the one dimensional linear subspaces of $X_1$ and $X_2$ respectively which
preserves linear independence, then there exists a one-to-one linear transformation T
from all of $X_1$ into all of $X_2$ such that if $A_x$ is the linear subspace of scalar multiples
of x, then $A_{T(x)} = A_x'$ for all x in $X_1$.

Let $\bar{x}$ be any non-zero element in $X_1$ and let $\bar{y}$ be any non-zero element in $A_{\bar{x}}'$.
It is clear that if T exists then $T(\bar{x}) = \lambda\bar{y}$ and T may be chosen so that $\lambda = 1$.
With this choice of $T(\bar{x})$, T is uniquely determined for all x in $X_1$. In fact if
$x = \lambda\bar{x}$ we must have $T(x) = \lambda\bar{y}$, and if x and $\bar{x}$ are linearly independent then
$A_x'$ and $A_{\bar{x}-x}'$ will be linearly independent whereas $A_{\bar{x}}'$, $A_x'$, and $A_{\bar{x}-x}'$ will be
linearly dependent. Hence any element in $A_{\bar{x}}'$, in particular $\bar{y}$, will be a unique
sum of elements $x_1$ and $y_1$ from $A_x'$ and $A_{\bar{x}-x}'$ respectively. Since we require
that $T(\bar{x}) = T(x) + T(\bar{x} - x)$, we must have $T(x) = x_1$.

It remains to show that the T so defined is linear. It will obviously then
have the other required properties. Let x and y be arbitrary elements of $X_1$.
Let M be a three dimensional subspace of $X_1$ containing x, y, and $\bar{x}$. Let M'
be the three dimensional subspace of $X_2$ spanned by the one dimensional
subspaces of the form A' where A is in M. Lemma A for three dimensional
spaces is well known. It is simply the theorem of projective geometry⁴ to the
effect that a collineation between two real projective planes can be represented
analytically by a linear transformation [3, vol. I, p. 190 and vol. II, p. 252].
Thus there is a linear transformation of the desired sort taking M into M'.
By the argument of the first paragraph, T on M must be a constant multiple
of this transformation. Therefore if $\lambda$ and $\mu$ are arbitrary scalars,
$T(\lambda x + \mu y) = \lambda T(x) + \mu T(y)$ and, since x and y were arbitrary, T is linear.

Before stating and proving Lemma B we make a few preliminary remarks
concerning the relation between the "maximal" subspaces of a linear space and
the linear functionals defined on the space. We define a maximal subspace as a
proper subspace contained in no other proper subspace. It is clear that if z
is any element in the complement of a maximal subspace M of a linear space X,
then any element in X has a unique representation in the form $x = m + \lambda z$
where m is in M and $\lambda$ is a scalar. If we make the definition $f(m + \lambda z) = \lambda$,
it is easily seen that f(x) is a linear functional which vanishes on M and only
on M. Conversely, if f(x) is any non-trivial linear functional defined on X, it
is easily verified that the set of elements x of X such that f(x) = 0 is a maximal
subspace of X. We call it the null-space of f(x). Finally, suppose that $f_1(x)$
and $f_2(x)$ are non-trivial linear functionals having the same null-space. Let z
be in the complement of this subspace. Then $f_2(z)f_1(x) - f_1(z)f_2(x)$ is zero for
all x in X. Therefore $f_2(x) = kf_1(x)$ where k is a constant. Thus there is a


4 Thanks are due the referee for suggesting the use of this theorem to eliminate a large
part of the author's original proof.















natural one-to-one correspondence between the maximal subspaces of X and the
one dimensional subspaces of the space of all linear functionals on X. If f(x)
is continuous as well as linear then its null-space is obviously closed. (We
suppose now that X is normed.) Conversely, given any closed maximal subspace
M of X, it follows from the lemma on page 57 of [1] that there exists a
non-trivial continuous linear functional vanishing on M and hence having M for its
null-space. Thus every linear functional having M as its null-space is
continuous. In other words our one-to-one correspondence associates closed
subspaces with continuous functionals and vice-versa.
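
The proportionality of two functionals with a common null-space can be checked in one line. The following worked step is a sketch under the notation of the preceding remarks: $x = m + \lambda z$ with m in the common null-space of $f_1$ and $f_2$, and z outside it, so that $f_1(z) \neq 0$.

```latex
\begin{align*}
f_2(z)f_1(x) - f_1(z)f_2(x)
  &= f_2(z)\bigl(f_1(m) + \lambda f_1(z)\bigr) - f_1(z)\bigl(f_2(m) + \lambda f_2(z)\bigr) \\
  &= \lambda f_2(z)f_1(z) - \lambda f_1(z)f_2(z) = 0,
\end{align*}
\text{so that } f_2(x) = k f_1(x) \text{ with } k = f_2(z)/f_1(z).
```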

LEMMA B. If $X_1$ and $X_2$ are normed linear spaces and T is a one-to-one linear
transformation of all of $X_1$ into all of $X_2$ such that T and $T^{-1}$ carry maximal closed
subspaces into maximal closed subspaces, then T is a homeomorphism and hence
$X_1$ and $X_2$ are isomorphic.

Let $f_2(x)$ be an arbitrary non-trivial continuous linear functional defined on $X_2$.
Let $M_2$ be the null-space of $f_2$ and let $M_1 = T^{-1}(M_2)$. $M_1$ is a closed maximal
subspace of $X_1$. Therefore there exists a continuous linear functional $f_1(x)$
defined on $X_1$, having $M_1$ for its null-space. Let z be a member of the
complement of $M_1$. By adjusting the arbitrary scalar multiplier, $f_1$ may be chosen so
that $f_1(z) = 1$. If x is an arbitrary member of $X_1$, we may write $x = m + f_1(x)z$
where m is in $M_1$. Therefore $f_2(T(x)) = f_2(T(m)) + f_2(f_1(x)T(z)) = f_2(T(z))f_1(x)$
since T(m) is in $M_2$. Now let $\{x_n\}$ be any bounded sequence of elements of $X_1$.
Then $\{f_1(x_n)\}$ is a bounded sequence of real numbers. Hence since
$f_2(T(x_n)) = f_2(T(z))f_1(x_n)$, $\{f_2(T(x_n))\}$ is a bounded sequence of real numbers. But $f_2$ may
be any continuous linear functional on $X_2$. Therefore, by [1, p. 80, Théorème 6],
$\{T(x_n)\}$ is a bounded sequence of elements of $X_2$. Thus T takes bounded sets
into bounded sets. Hence using the theorem on page 54 of [1] we conclude
that T is continuous. By an exactly analogous argument, $T^{-1}$ is continuous.
This completes the proof.

Lemma B essentially says that the topology of a normed linear space is
determined as soon as it is given which maximal linear subspaces are closed. This
is closely related to a theorem of Fichtenholz [4] to the effect that the topology
is determined by the set of continuous linear functionals.































II. The Lattice Theorem 










THEOREM. Let $X_1$ and $X_2$ be normed linear spaces. Let $L_1$ be the lattice of
closed linear subspaces of $X_1$ and $L_2$ that of $X_2$. Then $X_1$ and $X_2$ are isomorphic
as normed linear spaces if and only if $L_1$ and $L_2$ are isomorphic as lattices.

We begin by proving some lemmas. 

DEFINITION. If $S_1$ and $S_2$ are subsets of a linear space we denote by $S_1 + S_2$
the smallest linear subspace containing both $S_1$ and $S_2$, and by $S_1 +$ the smallest
linear subspace containing $S_1$.

LEMMA 2.1. If M is a closed linear subspace of a normed linear space X and $\bar{x}$
is any element of X, then $M + \bar{x}$ is closed.












Neglecting trivial cases we suppose that $\bar{x}$ is not in M and that $M + \bar{x} \neq X$.
By the lemma on page 57 of [1], there exists a continuous linear functional $f_1$
on X which vanishes on all members of M and is such that $f_1(\bar{x}) = 1$. Let y
be any element of X such that any continuous linear functional which vanishes
on all members of $M + \bar{x}$ also vanishes on y. If $y - f_1(y)\bar{x}$ is not in M, then,
again by the lemma on page 57 of [1], there exists a continuous linear functional
$f_2$, vanishing on all members of M and such that $f_2(y - f_1(y)\bar{x}) = 1$. But
$f_2 - f_2(\bar{x})f_1$ vanishes on all members of $M + \bar{x}$ and hence on y. Therefore
$f_2(y) - f_2(\bar{x})f_1(y) = f_2(y - f_1(y)\bar{x}) = 0$, contrary to the choice of $f_2$. Therefore
$y - f_1(y)\bar{x}$ is in M and y is in $M + \bar{x}$. Hence if y is any element not in $M + \bar{x}$,
there exists a continuous linear functional vanishing on all members of $M + \bar{x}$
and not vanishing on y. In other words $M + \bar{x}$ is an intersection of null-spaces
of continuous linear functionals and hence is closed.

As a corollary we have 

Lemma 2.2. Any finite dimensional subspace of a normed linear space is closed. 

LEMMA 2.3. If L is the lattice of closed linear subspaces of a normed linear
space and n is a positive integer, then a member $M_n$ of L has dimension n if and
only if there exist members of L, $M_1, M_2, \ldots, M_{n-1}$, such that $M_1$ covers the zero
of L and $M_{i+1}$ covers $M_i$ for $i = 1, 2, \ldots, n - 1$.

If $M_n$ has dimension n, let $x_1, x_2, \ldots, x_n$ be a basis for $M_n$. By Lemma 2.2,
$x_1 +$, $x_1 + x_2$, $\ldots$, $x_1 + x_2 + \cdots + x_n$ are all members of L, and obviously
each covers its predecessor; except of course $x_1 +$ which covers 0. Conversely,
since if N is finite dimensional and N' covers N, it is obvious that the dimension
of N' is one greater than that of N, we conclude from the existence of such a
chain that $M_n$ has dimension n.

We turn now to the proof of the theorem. If $X_1$ and $X_2$ are isomorphic it is
obvious that $L_1$ and $L_2$ are isomorphic. Suppose, conversely, that $L_1$ and $L_2$
are isomorphic. If A is any one dimensional subspace of $X_1$, it follows from
Lemma 2.2 that A is in $L_1$. Let A' be the correspondent of A in $L_2$ under the
lattice isomorphism. Since A covers 0, A' covers 0 and hence is one dimensional.
In this way we set up a one-to-one correspondence between the one dimensional
subspaces of $X_1$ and $X_2$ respectively. Let $A_1, A_2, \ldots, A_r$ be a linearly
independent set of one dimensional subspaces of $X_1$, and let $A_1', A_2', \ldots, A_r'$ be
their respective correspondents in $X_2$. If $A_1', A_2', \ldots, A_r'$ are linearly
dependent, then $A_1' + A_2' + \cdots + A_r' = M'$ has dimension $k < r$. By Lemma
2.2, M' is in $L_2$. Let M be the correspondent of M' in $L_1$. It is an easy
consequence of Lemma 2.3 that M has the same dimension as M'. But since $A_i' \subset M'$,
$A_i \subset M$ ($i = 1, 2, \ldots, r$). Therefore $A_1 + A_2 + \cdots + A_r$ has dimension
less than r. This contradicts our hypothesis that $A_1, A_2, \ldots, A_r$ are linearly
independent. Since the same argument applies to linearly independent sets of
subspaces of $X_2$, we see that our one-to-one correspondence preserves linear
independence. Hence if neither $X_1$ nor $X_2$ has dimension less than three we
may apply Lemma A and conclude the existence of a one-to-one linear
transformation T of all of $X_1$ into all of $X_2$ such that $A_{T(x)} = A_x'$ for all x in $X_1$. Let
M be any closed maximal subspace of $X_1$. M is covered by $X_1$. Hence its
correspondent in $L_2$ under the lattice isomorphism, M', is covered by $X_2$ and
consequently is maximal. Now x is in M if and only if $A_x \subset M$, which, by
virtue of the lattice isomorphism, is the case if and only if $A_x' \subset M'$. But
$A_x' = A_{T(x)}$. Therefore x is in M if and only if T(x) is in M'. It follows that
T(M) = M' and hence that T takes closed maximal subspaces into closed
maximal subspaces. Similarly, $T^{-1}$ does likewise and applying Lemma B, we
conclude that $X_1$ and $X_2$ are isomorphic. Suppose finally that either $X_1$ or $X_2$
has dimension less than three. Then since $X_1$ and $X_2$ correspond under the
lattice isomorphism, it follows from Lemma 2.3 that the other has this same
dimension. But any two finite dimensional normed linear spaces having the
same dimension are isomorphic by Lemma B, since, by Lemma 2.2, all of the
maximal subspaces of both are closed. This completes the proof.


III. The Ring Theorem 


DEFINITION. A continuous linear transformation E of a normed linear space X
into itself such that $E^2 = E$ will be called a projection. The set of elements of X
of the form E(x), where x is in X, will be called the range of the projection E. The
dimension of the range will be called the dimension of E. If $E_1$ and $E_2$ are
projections such that the range of $E_1$ is contained in the range of $E_2$ we shall say that $E_1$
is contained in $E_2$, and if furthermore $E_2$ is not contained in $E_1$, we shall say that
$E_1$ is properly contained in $E_2$. Following the terminology used in lattice theory,
we shall say that a projection $E_2$ covers a projection $E_1$ if $E_1$ is contained properly
in $E_2$ and there exists no projection $E_3$ contained properly in $E_2$ and properly
containing $E_1$.

Lemma 3.1. Given any finite dimensional subspace M of a normed linear space 
X, there exists a projection E whose range is M. 

Let $x_1, x_2, \ldots, x_n$ be a basis for M. For each $i = 1, 2, \ldots, n$ set
$f_i(c_1x_1 + \cdots + c_nx_n) = c_i$. Then $f_i(x)$ is a linear functional defined on M, and since
its null-space is finite dimensional and hence closed, it is continuous. By the
Hahn-Banach extension theorem [1, p. 55, Théorème 2], $f_i$ can be extended so
as to be a continuous linear functional $F_i$ defined throughout X. Let
$E_i(x) = F_i(x)x_i$. Since $F_i$ is continuous and linear, $E_i$ is a continuous linear
transformation of X into itself; hence so is $E = E_1 + E_2 + \cdots + E_n$. Finally,
$E_i(E_j(x)) = F_i(x_j)F_j(x)x_i$. This is identically zero for $i \neq j$ and identically
$E_i(x)$ for $i = j$. Therefore $E^2 = (E_1 + E_2 + \cdots + E_n)^2 = E_1 + E_2 + \cdots + E_n = E$.
Therefore E is a projection whose range is obviously M.
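
The construction in this proof can be tried out numerically in a finite dimensional space, where no appeal to the Hahn-Banach theorem is needed. The sketch below is only an illustration under stated assumptions: the space is $\mathbb{R}^3$, the extension $F_i$ is taken to be a coordinate functional of a basis completed by a hypothetical `complement` vector, and the helper names `mat_mul`, `inverse` and `projection_onto` are invented for the example.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse of a square matrix, using exact fractions."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def projection_onto(basis, complement):
    """Matrix of E = E_1 + ... + E_n with E_i(x) = F_i(x) x_i.

    `basis` spans M; `complement` completes it to a basis of the whole space.
    F_i is the i-th coordinate functional of the completed basis, which is one
    possible extension of f_i from M to the whole space (an assumption).
    """
    vectors = basis + complement
    dim = len(vectors)
    # Columns of b are the basis vectors; rows of its inverse are the F_i.
    b = [[Fraction(vectors[j][i]) for j in range(dim)] for i in range(dim)]
    b_inv = inverse(b)
    e = [[Fraction(0)] * dim for _ in range(dim)]
    for i, x_i in enumerate(basis):
        f_i = b_inv[i]
        for r in range(dim):
            for c in range(dim):
                e[r][c] += Fraction(x_i[r]) * f_i[c]  # adds the matrix of E_i
    return e

x1, x2 = [1, 0, 1], [0, 1, 1]
E = projection_onto([x1, x2], [[0, 0, 1]])
```

With this choice `mat_mul(E, E)` equals `E`, and `E` fixes both `x1` and `x2`, in line with the statement that E is a projection with range M. A different complement gives a different projection with the same range.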

Lemma 3.2. Given any closed maximal subspace M of a normed linear space X, 
there exists a projection E whose range is M. 

Let z be a member of the complement of M. Let f be a linear functional
whose null-space is M and choose the arbitrary scalar multiple so that f(z) = 1.
Set E(x) = x - f(x)z. Since M is closed, f is continuous. Therefore E(x) is
a continuous linear transformation of X into itself. f(E(x)) = f(x) - f(x) = 0.
Therefore $E^2(x) = E(E(x)) = E(x) - f(E(x))z = E(x)$. Thus E(x) is a projection.
Finally, since f(E(x)) = 0, the range of E(x) is contained in M, and
furthermore, since if x is in M, then f(x) = 0 so that E(x) = x, we see that the
range of E is M.
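
The formula E(x) = x - f(x)z is easy to test on a small example. The following is a minimal sketch, assuming the space is $\mathbb{R}^3$ and f is given by a coefficient vector; the function name `make_projection` and the sample vectors are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def make_projection(f_coeffs, z):
    """Return (f, E) with f(x) = sum(f_coeffs[i] * x[i]) and E(x) = x - f(x) z.

    f is rescaled inside E so that f(z) = 1, as in the proof above.
    """
    def f(x):
        return sum(Fraction(a) * Fraction(v) for a, v in zip(f_coeffs, x))

    scale = f(z)
    if scale == 0:
        raise ValueError("z must lie outside the null-space of f")

    def E(x):
        fx = f(x) / scale  # value of the normalised functional at x
        return [Fraction(v) - fx * Fraction(w) for v, w in zip(x, z)]

    return f, E

f, E = make_projection([1, 1, 1], [0, 0, 1])
x = [2, 3, 4]
```

Here `E(x)` lands in the null-space of f, applying `E` twice gives the same result as applying it once, and a vector already in the null-space, such as `[1, -1, 0]`, is left unchanged.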

LEMMA 3.3. If $E_1$ and $E_2$ are projections defined on a normed linear space,
then $E_1$ is contained in $E_2$ if and only if $E_2E_1 = E_1$.

If $E_2E_1 = E_1$ and x is in the range of $E_1$, then $x = E_1(y)$. But $E_2E_1(y) = E_1(y)$;
that is $x = E_2(x)$. Therefore x is in the range of $E_2$. Conversely, if $E_1$ is
contained in $E_2$, then for each x in X, $E_1(x) = E_2(y)$. Therefore $E_2(E_1(x)) =
E_2^2(y) = E_2(y) = E_1(x)$. That is, $E_2E_1 = E_1$.
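
The criterion of Lemma 3.3 can be compared with a direct test of range inclusion. The sketch below assumes projections on $\mathbb{R}^3$ given as matrices; containment of ranges is checked independently through a rank computation, and the helper names `rank` and `range_contained` as well as the three sample matrices are assumptions made for the example.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def rank(m):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def range_contained(e1, e2):
    """True when the column space of e1 lies inside the column space of e2."""
    stacked = [row2 + row1 for row1, row2 in zip(e1, e2)]
    return rank(stacked) == rank(e2)

E2 = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]   # range spanned by (1,0,1) and (0,1,1)
E3 = [[1, 0, 0], [0, 0, 0], [1, 0, 0]]   # range spanned by (1,0,1)
E1 = [[1, 1, 0], [0, 0, 0], [0, 0, 0]]   # range spanned by (1,0,0)
```

For these matrices the range of `E3` lies in the range of `E2` and `mat_mul(E2, E3)` equals `E3`, while the range of `E1` does not lie in the range of `E2` and `mat_mul(E2, E1)` differs from `E1`.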

LEMMA 3.4. If n is a positive integer and $E_n$ is a projection defined on a normed
linear space, then $E_n$ has dimension n if and only if there exist projections $E_1$,
$E_2, \ldots, E_{n-1}$ such that $E_{i+1}$ covers $E_i$ for $i = 1, 2, \ldots, n - 1$, and $E_1$ covers 0.

The truth of this lemma follows easily from Lemma 3.1 using an argument 
similar to that used in proving Lemma 2.3. 

Lemma 3.5. If E is a projection on a normed linear space X, then the range of 
E is a closed maximal subspace of X if and only if the projection 1 covers E. 

If the range of E is maximal it is obvious that 1 covers E. Conversely,
suppose that 1 covers E. If x = E(y), then E(x) = E(E(y)) = E(y) = x.
Therefore x - E(x) = 0. If x - E(x) = 0, then x = E(x). In other words,
the range of E is the null-space of the continuous transformation 1 - E and
hence is closed. It follows from the lemma on page 57 of [1] that the range
of E is contained in a closed maximal subspace M of X and hence by Lemma 3.2,
there exists a projection E' whose range is M and which contains E. Since 1
covers E, M must also be the range of E.

THEOREM. Let $X_1$ and $X_2$ be normed linear spaces. Let $R_1$ be the ring of all
continuous linear transformations of $X_1$ into itself and $R_2$ that of $X_2$. Then $X_1$
and $X_2$ are isomorphic as normed linear spaces if and only if $R_1$ and $R_2$ are
isomorphic as rings.

The necessity is obvious. The proof of the sufficiency is so much like that
of the lattice theorem that we shall not give it in detail. Obviously, the ring
isomorphism takes projections into projections. From Lemma 3.3 it follows
that inclusion of projections is preserved. Combining Lemmas 3.3 and 3.4 we
conclude that finite dimensional projections correspond to projections of the
same dimension, and using Lemma 3.5 instead of Lemma 3.4, that projections
with closed maximal ranges go into projections with closed maximal ranges.
We set up a one-to-one correspondence between the one dimensional subspaces
of $X_1$ and $X_2$ by passing from such a subspace A in $X_1$ through a projection E
having A for its range, to the range A' of its correspondent E' in $R_2$. That A'
is uniquely determined by A is an easy consequence of Lemma 3.3. We
establish the preservation of linear independence much as we did in the lattice
theorem; using Lemma 3.1 to give us a k dimensional projection containing r
one dimensional projections whose ranges are linearly dependent. We show
that the T we get by using Lemma A is a homeomorphism using Lemma B and
Lemma 3.5. Finally, if $X_1$ or $X_2$ has dimension less than three, we show that





Tc tt? Ares AE IA 




















248 GEORGE W. MACKEY 





M be any closed maximal subspace of X;. M is covered by Xi. Hence its 
correspondent in L: under the lattice isomorphism, M’, is covered by X> and 
consequently is maximal. Now z is in M if and only if Az © M, which, by 
virtue of the lattice isomorphism, is the case if and only if A: <M’. But 
Al = Anz. Therefore x is in M if and only if T(x) isin M’. It follows that 
T(M) = M’ and hence that 7 takes closed maximal subspaces into closed 
maximal subspaces. Similarly, 7”* does likewise and applying Lemma B, we 
conclude that X, and X»2 are isomorphic. Suppose finally that either X, or X, 
has dimension less than three. Then since X; and X:2 correspond under the 
lattice isomorphism, it follows from Lemma 2.3 that the other has this same 
dimension. But any two finite dimensional normed linear spaces having the 
same dimension are isomorphic by Lemma B, since, by Lemma 2.2, all of the 
maximal subspaces of both are closed. This completes the proof. 


III. The Ring Theorem 


DeFINITION. A continuous linear transformation E of a normed linear space X 
into itself such that E” = E will be called a projection. The set of elements of X 
of the form E(x), where x is in X, will be called the range of the projection E. The 
dimension of the range will be called the dimension of E. If E, and E, are projec- 
tions such that the range of E; is contained in the range of E2 we shall say that E, 
is contained in E2 , and if furthermore E» is not contained in E, , we shall say that 
E, is properly contained in E,. Following the terminology used in lattice theory, 
we shall say that a projection EF, covers a projection E, if EF; is contained properly 
in E> and there exists no projection E; contained properly in E2 and properly 
containing Ey . 

Lemma 3.1. Given any finite dimensional subspace M of a normed linear space 
X, there exists a projection E whose range is M. 

Let a1, %2,°+:,2%, beabasisfor M. For eachi = 1, 2,--- , n set fi(ciai + 
-++ + ¢€,%,) = c;. Then f,(x) is a linear functional defined on M, and since 
its null-space is finite dimensional and hence closed, it is continuous. By the 
Hahn-Banach extension theorem, [1, p. 55, Théoréme 2], f; can be extended so 
as to be a continuous linear functional F defined throughout X. Let £;(x) 
F(x)x;. Since F; is continuous and linear, E; is a continuous linear transforma- 
tion of X into itself; hence so is EH = EF, + A, +---+4#H,. Finally, 
EXE (x)) = Fy(x;)F (x)x;. This is identically zero for i # j and identically 
E;(x) fori = j. Therefore EF’ = (Ai + Fo +--- +H, =E, +A, t+-:: + 
E, = E. Therefore EF is a projection whose range is obviously M. 

LEMMA 3.2. Given any closed maximal subspace M of a normed linear space a 
there exists a projection E whose range is M. 

Let z be a member of the complement of M. Let f be a linear functional 
whose null-space is M and choose the arbitrary scalar multiple so that f(z) = 1. 
Set E(x) = x — f(x)z. Since M is closed, f is continuous. Therefore E(x) is 
a continuous linear transformation of X into itself. f(E(x)) = f(x) — f(x) = 0. 
Therefore E’(x) = E(E(x)) = E(x) — f(E(x))z = E(x). Thus E(z) is a projec- 













an 
m 








ut 
lat 


we 
Xo 
he 
me 


he 


X 
4 
"he 
ec- 


vat 
'Y; 


ISOMORPHISMS OF NORMED LINEAR SPACES 249 


tion. Finally, since f(E(x)) = 0, the range of E(x) is contained in M, and 
furthermore, since if x is in M, then f(x) = 0 so that E(x) = 2, we see that the 
range of E is M. 

Lemma 3.3. If E, and Ez are projections defined on a normed linear space, 
then E; is contained in E» if and only if E2xE, = Ey. 

If H.E, = E, and zis in the range of £,, thenz = Ey. But E.ki(y) = E,(y); 
that isa = E.(x). Therefore x is in the range of E,. Conversely, if 2; is con- 
tained in E,, then for each x in X, E,(x) = E,(y). Therefore E.(E;(x)) = 
E3(y) = E2(y) = E\(2). That is, ELE, = Ey. 

Lemma 3.4. If nis a positive integer and E,, is a projection defined on a normed 
linear space, then E,, has dimension n if and only if there exist projections E, , 
E,, +++, En such that E;4: covers E; fori = 1, 2,---n — 1, and E, covers 0. 

The truth of this lemma follows easily from Lemma 3.1 using an argument 
similar to that used in proving Lemma 2.3. 

Lema 3.5. If E is a projection on a normed linear space X, then the range of 
E is a closed maximal subspace of X if and only if the projection 1 covers E. 

If the range of $E$ is maximal it is obvious that 1 covers $E$. Conversely,
suppose that 1 covers $E$. If $x = E(y)$, then $E(x) = E(E(y)) = E(y) = x$.
Therefore $x - E(x) = 0$. If $x - E(x) = 0$, then $x = E(x)$. In other words,
the range of $E$ is the null-space of the continuous transformation $1 - E$ and
hence is closed. It follows from the lemma on page 57 of [1] that the range
of $E$ is contained in a closed maximal subspace $M$ of $X$ and hence by Lemma 3.2,
there exists a projection $E'$ whose range is $M$ and which contains $E$. Since 1
covers $E$, $M$ must also be the range of $E$.

THEOREM. Let $X_1$ and $X_2$ be normed linear spaces. Let $R_1$ be the ring of all
continuous linear transformations of $X_1$ into itself and $R_2$ that of $X_2$. Then $X_1$
and $X_2$ are isomorphic as normed linear spaces if and only if $R_1$ and $R_2$ are
isomorphic as rings.

The necessity is obvious. The proof of the sufficiency is so much like that
of the lattice theorem that we shall not give it in detail. Obviously, the ring
isomorphism takes projections into projections. From Lemma 3.3 it follows
that inclusion of projections is preserved. Combining Lemmas 3.3 and 3.4 we
conclude that finite dimensional projections correspond to projections of the
same dimension, and using Lemma 3.5 instead of Lemma 3.4, that projections
with closed maximal ranges go into projections with closed maximal ranges.
We set up a one-to-one correspondence between the one dimensional subspaces
of $X_1$ and $X_2$ by passing from such a subspace $A$ in $X_1$ through a projection $E$
having $A$ for its range, to the range $A'$ of its correspondent $E'$ in $R_2$. That $A'$
is uniquely determined by $A$ is an easy consequence of Lemma 3.3. We
establish the preservation of linear independence much as we did in the lattice
theorem, using Lemma 3.1 to give us a $k$ dimensional projection containing $r$
one dimensional projections whose ranges are linearly dependent. We show
that the $T$ we get by using Lemma A is a homeomorphism using Lemma B and
Lemma 3.5. Finally, if $X_1$ or $X_2$ has dimension less than three, we show that
$X_1$ and $X_2$ have the same finite dimension and hence are isomorphic by
observing that the units of $R_1$ and $R_2$ must correspond under the ring isomorphism
and hence, being finite dimensional projections, must have the same dimension.

Eidelheit’s proof of this theorem is considerably shorter than ours. This is 
principally due to the fact that by using a device apparently only applicable in 
the ring situation he is able to avoid having to prove Lemma A. We give the 
longer proof here in order to emphasize the close relationship existing between 
this theorem and the other two. 


IV. The Group Theorem 


We begin by discussing the notion of pseudo-reflexivity. Let $X$ be a normed
linear space and let $\bar X$ be its conjugate space. For each $x$ in $X$, as is well known,
if we define $F_x(f) = f(x)$ for all $f$ in $\bar X$, $F_x$ is a member of $\bar{\bar X}$ and $\|F_x\| = \|x\|$.
In general, there will be members of $\bar{\bar X}$ which have no such representation. In
the contrary case $X$ is said to be reflexive. Even if $X$ is not reflexive it may be
such that a new norm may be introduced into $\bar X$ under which a linear functional
is continuous if and only if it is an $F_x$. We shall call such a space
pseudo-reflexive. The new norm in $\bar X$ is not uniquely determined but by virtue of the
theorem of Fichtenholz referred to at the end of Lemma B this is the case for
the corresponding norm topology. The topological linear space which $\bar X$
becomes under this topology we call the pseudo-conjugate of $X$. Obviously, the
pseudo-reflexivity and the pseudo-conjugate of $X$ depend only upon the norm
topology in $X$ and not upon the particular norm. Therefore we may speak of
the pseudo-conjugate of the pseudo-conjugate of a pseudo-reflexive space $X$, and
it is obvious that it always exists and is isomorphic to $X$. If $X_1$ is
pseudo-reflexive and $X_2$ is isomorphic to the pseudo-conjugate of $X_1$ so that $X_2$ is also
pseudo-reflexive and $X_1$ is isomorphic to the pseudo-conjugate of $X_2$, we say
briefly that $X_1$ and $X_2$ are pseudo-reflexive and mutually pseudo-conjugate.

A few words on the relationship between reflexivity and pseudo-reflexivity.
Clearly reflexive spaces are pseudo-reflexive. On the other hand, we can show
without difficulty that if $X$ is complete and pseudo-reflexive, then $X$ is reflexive.
In fact if $X$ is pseudo-reflexive and complete, let $F$ be a member of the second
conjugate of $X$. Let $\{f_n\}$ be a sequence of members of $\bar X$, bounded as a sequence
of members of the pseudo-conjugate of $X$. Then $\{f_n(x)\}$ is a bounded sequence
of real numbers for each $x$ in $X$. Therefore, by [1, p. 80, Théorème 5], $\{f_n\}$ is
bounded as a sequence of members of the ordinary conjugate of $X$ and hence
$\{F(f_n)\}$ is a bounded sequence of real numbers. Therefore $F$ is a continuous
linear functional on the pseudo-conjugate of $X$ [1, p. 54, Théorème 1] and hence
is an $F_x$. Thus $X$ is reflexive. Finally, since, as is shown in [5], the conjugate
of any normed linear space is complete, no incomplete space is ever reflexive.
In other words, a pseudo-reflexive space is reflexive if and only if it is complete.
We have examples which we expect to publish later of non-complete
pseudo-reflexive normed linear spaces.







THEOREM. Let $X_1$ and $X_2$ be normed linear spaces. Let $G_1$ be the group of
all linear transformations of $X_1$ into all of itself which are continuous and have
continuous inverses and let $G_2$ be that of $X_2$. Then $G_1$ and $G_2$ are isomorphic as
groups if and only if either (a) $X_1$ and $X_2$ are isomorphic as normed linear spaces
or (b) $X_1$ and $X_2$ are pseudo-reflexive and mutually pseudo-conjugate.

We preface the proof of the theorem proper with a series of lemmas concerning
what we shall call involutions.* Given a normed linear space $X$, a continuous
linear transformation $T$ of $X$ into itself such that $T^2 = 1$ will be called an
involution. Clearly any involution on $X$ is a member of the group $G$ for $X$. If $T$
is an involution on $X$, let $M_+$ be the subspace of $X$ containing all elements $x$ in
$X$ such that $T(x) = x$, and let $M_-$ be the subspace containing all those such
that $T(x) = -x$. Let $x$ be any element in $X$. We may write
$x = \tfrac12(x + T(x)) + \tfrac12(x - T(x))$. But $T(\tfrac12(x + T(x))) = \tfrac12(T(x) + x) = \tfrac12(x + T(x))$ and
$T(\tfrac12(x - T(x))) = \tfrac12(T(x) - x) = -\tfrac12(x - T(x))$. Hence $x$ can be represented
as the sum of an element in $M_+$ and an element in $M_-$. Since $M_+$ and $M_-$
have nothing in common but 0, this representation is unique. Thus each
involution $T$ "decomposes" $X$ into two disjoint closed subspaces in one of which
$T$ is 1 and in the other of which $T$ is $-1$. These subspaces will be called the
subspaces of $T$. If at least one of them is finite dimensional, the dimension of
the one of smaller dimension will be called the dimension of $T$. If neither is
finite dimensional $T$ will be said to be infinity dimensional.
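
The decomposition $x = \tfrac12(x + T(x)) + \tfrac12(x - T(x))$ can be checked numerically. The sketch below assumes, purely for illustration, that $X = R^2$ and that $T$ is the coordinate swap; the helper name `split` is hypothetical.

```python
# Sketch of the decomposition of x into its M_+ and M_- parts.
# Assumption (not from the paper): X = R^2 and T swaps the two coordinates.
from fractions import Fraction

def T(x):
    """An involution on R^2: T(T(x)) = x."""
    return (x[1], x[0])

def split(x):
    """Return (1/2)(x + T(x)) and (1/2)(x - T(x))."""
    half = Fraction(1, 2)
    tx = T(x)
    plus = tuple(half * (a + b) for a, b in zip(x, tx))
    minus = tuple(half * (a - b) for a, b in zip(x, tx))
    return plus, minus

x = (3, 8)
plus, minus = split(x)

print(T(plus) == plus)                          # plus part is fixed by T
print(T(minus) == tuple(-c for c in minus))     # minus part is negated by T
print(tuple(p + q for p, q in zip(plus, minus)) == x)  # the parts add up to x
```

Here $M_+$ is the diagonal and $M_-$ the antidiagonal, both one dimensional, so in the terminology of the text this $T$ is a one dimensional involution.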

LEMMA 4.1. If $X$ is a normed linear space, $M$ is a finite dimensional subspace
and $M'$ is any closed subspace of $X$ such that $M + M' = X$ and $M \cap M' = 0$,
then there exists an involution $T$ having $M$ and $M'$ for its subspaces.

Let $x_1, x_2, \ldots, x_n$ be a basis for $M$. For each $i = 1, 2, \ldots, n$, consider
$M_i = M' + x_1 + x_2 + \cdots + x_{i-1} + x_{i+1} + \cdots + x_n$. By Lemma 2.1, $M_i$
is closed and since $M \cap M' = 0$ and the $x_i$ are linearly independent, $x_i$ is in the
complement of $M_i$. Hence there exists a continuous linear functional $f_i$ which
has $M_i$ for its null-space and is such that $f_i(x_i) = 1$. Let $T(x) = 2f_1(x)x_1 +
2f_2(x)x_2 + \cdots + 2f_n(x)x_n - x$. Then $T$ is a continuous linear transformation
of $X$ into itself such that $T(x_i) = 2x_i - x_i = x_i$ ($i = 1, 2, \ldots, n$) and
$T(x) = -x$ for all $x$ in $M'$. Therefore $T^2(x) = x$ for all $x$ in $X$ and $T$ is an
involution. Since $T$ is 1 in $M$ and $-1$ in $M'$ it follows readily that $M$ and $M'$
are the subspaces of $T$.
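
The formula $T(x) = 2f_1(x)x_1 + \cdots + 2f_n(x)x_n - x$ can be exercised in a small case. The sketch below assumes, for illustration only, that $X = R^3$, that $M$ is spanned by the hypothetical vectors `x1` and `x2`, and that $M'$ is the third coordinate axis; the functionals `f1` and `f2` are chosen to match the requirements in the proof.

```python
# Sketch of the involution built in the proof of Lemma 4.1.
# Assumptions (not from the paper): X = R^3, M = span{x1, x2}, M' = span{(0, 0, 1)}.

x1 = (1, 0, 1)
x2 = (0, 1, 1)

def f1(x):
    """Vanishes on M' and on x2, and f1(x1) = 1."""
    return x[0]

def f2(x):
    """Vanishes on M' and on x1, and f2(x2) = 1."""
    return x[1]

def T(x):
    """T(x) = 2 f1(x) x1 + 2 f2(x) x2 - x."""
    a, b = 2 * f1(x), 2 * f2(x)
    return tuple(a * u + b * v - w for u, v, w in zip(x1, x2, x))

m_prime = (0, 0, 1)
x = (4, -2, 7)

print(T(x1) == x1 and T(x2) == x2)   # T is 1 on M
print(T(m_prime) == (0, 0, -1))      # T is -1 on M'
print(T(T(x)) == x)                  # T is an involution
```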

Lemma 4.2. If M is a finite dimensional subspace of a normed linear space X, 
then there exists an involution T having M as one of its subspaces. 

By Lemma 3.1, there exists a projection $E$ whose range is $M$. Let $M'$ be
the null space of $E$. Then, as is readily verified, $M$ and $M'$ satisfy the hypotheses
of Lemma 4.1.

LEMMA 4.3. Let $M_+$ and $M_-$ be the subspaces of an involution $T$ on a normed
linear space $X$. Let $U_+$ and $U_-$ respectively be arbitrary continuous linear
transformations of $M_+$ and $M_-$ into themselves. Let $U$ be the unique linear
transformation coinciding on $M_+$ and $M_-$ with $U_+$ and $U_-$. Then $U$ is continuous.


* Cf. Sobczyk [6] page 80.







Let $\{x_n\}$ be a bounded sequence of elements of $X$. Then $\{x_n + T(x_n)\}$ is a
bounded sequence of elements of $M_+$ and $\{x_n - T(x_n)\}$ is a bounded sequence
of elements of $M_-$. Accordingly, $\{U_+(x_n + T(x_n))\}$ is a bounded sequence of
elements of $M_+$, and $\{U_-(x_n - T(x_n))\}$ is a bounded sequence of elements
of $M_-$. But $U_+(x_n + T(x_n)) + U_-(x_n - T(x_n)) = U(x_n + T(x_n) + x_n -
T(x_n)) = 2U(x_n)$ ($n = 1, 2, \ldots$). Therefore $\{U(x_n)\}$ is a bounded sequence
of elements of $X$. Hence by [1, p. 54, Théorème 1], $U$ is continuous.

LEMMA 4.4. If $T$ is an involution on a normed linear space $X$ and $U$ is an
arbitrary linear transformation of $X$ into itself then $UT = TU$ if and only if
$U(M_+) \subset M_+$ and $U(M_-) \subset M_-$ where $M_+$ and $M_-$ are the subspaces of $T$.

If $UT = TU$ and $x$ is in $M_-$ then $T(U(x)) = U(T(x)) = U(-x) = -U(x)$.
Hence $U(x)$ is in $M_-$. Similarly, if $x$ is in $M_+$ then $T(U(x)) = U(T(x)) =
U(x)$ and $U(x)$ is in $M_+$. Conversely, suppose $U(M_+) \subset M_+$ and $U(M_-) \subset
M_-$. If $x$ is in $X$, then $x = x_+ + x_-$ where $x_+$ is in $M_+$ and $x_-$ is in
$M_-$. $UT(x) = U(x_+ - x_-) = U(x_+) - U(x_-)$. $TU(x) = T(U(x_+) +
U(x_-)) = U(x_+) - U(x_-)$. Therefore $UT(x) = TU(x)$ for all $x$ in $X$.
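
The equivalence in Lemma 4.4 can be observed on small matrices. The sketch below assumes, for illustration only, that $X = R^2$ and that $T$ is the diagonal matrix with entries $1, -1$, so that $M_+$ and $M_-$ are the coordinate axes; the sample matrices are hypothetical.

```python
# Sketch of Lemma 4.4: U commutes with T exactly when U maps M_+ and M_- into themselves.
# Assumption (not from the paper): X = R^2, T = diag(1, -1).

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

T = [[1, 0], [0, -1]]  # M_+ is the first axis, M_- is the second axis

def commutes(u):
    return matmul(u, T) == matmul(T, u)

def preserves_subspaces(u):
    # U maps the first axis into itself when u[1][0] == 0,
    # and the second axis into itself when u[0][1] == 0.
    return u[1][0] == 0 and u[0][1] == 0

samples = [
    [[2, 0], [0, 3]],  # diagonal: preserves both axes
    [[0, 1], [1, 0]],  # swap: exchanges the axes
    [[1, 1], [0, 1]],  # shear: moves the second axis off itself
    [[0, 0], [0, 0]],  # zero map
]

for u in samples:
    print(commutes(u), preserves_subspaces(u))  # the two answers agree on each sample
```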

LEMMA 4.5. If $U$ is a linear transformation of a normed linear space $X$ into
itself and $UT = TU$ for every involution $T$ on $X$, then $U$ is a constant; that is,
there exists a scalar $\lambda$ such that $U(x) = \lambda x$ for all $x$ in $X$.

Given any $x$ in $X$, by Lemma 4.2, there exists an involution $T$ one of whose
subspaces is $x\,+$. Since $UT = TU$, it follows from Lemma 4.4 that $U(x) =
\lambda_x x$. Let $x$ and $y$ be any two elements of $X$. $U(x + y) = \lambda_{x+y}(x + y) =
\lambda_x x + \lambda_y y$. Hence if $x$ and $y$ are linearly independent, $\lambda_x = \lambda_{x+y} = \lambda_y$. If
$y = \mu x$, then $U(y) = \mu U(x) = \mu\lambda_x x = \lambda_x y$. In any case $\lambda_x = \lambda_y$. Therefore
there exists $\lambda$, independent of $x$, such that $U(x) = \lambda x$ for all $x$ in $X$.

DEFINITION. If $A$ is an arbitrary set of continuous linear transformations of
a normed linear space into itself, we shall let $A^*$ denote the set of all involutions $T$
such that $UT = TU$ for every $U$ in $A$.

DEFINITION. Let $T_0$ be an involution. For each $n = 1, 2, \ldots$ choose
involutions $T_1, T_2, \ldots, T_n$, not necessarily distinct, such that $T_iT_j = T_jT_i$ ($i, j =
0, 1, \ldots, n$). For each choice of $T_1, T_2, \ldots, T_n$ there will be a certain number
of elements in $(T_0, T_1, \ldots, T_n)^{**}$. As we shall see, these numbers are finite and
for fixed $T_0$ and $n$ form a bounded set. We denote by $f(T_0, n)$ the largest number in
the set for each $T_0$ and $n$.

LEMMA 4.6. If $T_0$ is an involution on a normed linear space $X$ and $X$ is not
finite dimensional, then $T_0$ is finite dimensional if and only if $\sup_n (f(T_0, n)/2^{2^n})$
is finite, and if $T_0$ is finite dimensional, its dimension is equal to
$\log_2(\sup_n (f(T_0, n)/2^{2^n}))$.

Let $T_1, T_2, \ldots, T_n$ be such that $T_iT_j = T_jT_i$ ($i, j = 0, 1, \ldots, n$). Denote
the subspace on which $T_j$ is 1 by $M_j^0$ and that on which it is $-1$ by $M_j^1$ ($j = 0,
1, \ldots, n$). Since $T_0T_1 = T_1T_0$ it follows from Lemma 4.4 that $T_1(M_0^0) \subset M_0^0$
and $T_1(M_0^1) \subset M_0^1$. Hence $M_0^0 = (M_0^0 \cap M_1^0) + (M_0^0 \cap M_1^1)$ and similarly
for $M_0^1$. Let $M_0^{i_0} \cap M_1^{i_1} = X_{i_0i_1}$ ($i_j = 0, 1$; $j = 0, 1$). Then $X = X_{00} +
X_{01} + X_{10} + X_{11}$ where each $X_{i_0i_1}$ has nothing in common with the + union of
all the rest except 0 and we see that $T_0$ and $T_1$ are constant in each $X_{i_0i_1}$ and, by
applying Lemma 4.3 twice, that any linear transformation of $X$ into itself which
takes each $X_{i_0i_1}$ into itself (that is, $T(X_{i_0i_1}) \subset X_{i_0i_1}$) and is continuous on each
$X_{i_0i_1}$ is continuous. Since $T_2T_0 = T_0T_2$ and $T_2T_1 = T_1T_2$, $T_2$ takes each $X_{i_0i_1}$
into itself and accordingly decomposes each $X_{i_0i_1}$ into $X_{i_0i_11}$ and $X_{i_0i_10}$.
Continuing this process we finally obtain $X = X_{11\cdots1} + X_{11\cdots0} + \cdots$, the summation
extending over $X$'s having as subscripts all $(n + 1)$-uples of 0's and 1's, where
$X_{i_0i_1\cdots i_n} = M_0^{i_0} \cap M_1^{i_1} \cap \cdots \cap M_n^{i_n}$. Furthermore, as above, we
conclude that each $T_j$ ($j = 0, 1, \ldots, n$) is constant in each $X_{i_0i_1\cdots i_n}$ and that any
linear transformation of $X$ into itself which takes each $X_{i_0i_1\cdots i_n}$ into itself and
is continuous on each $X_{i_0i_1\cdots i_n}$ is continuous. Let $U$ be any member of $(T_0,
T_1, \ldots, T_n)^*$. Using Lemma 4.4, it is easy to see that $U$ takes each $X_{i_0i_1\cdots i_n}$
into itself. Conversely, since each $T_j$ is constant in each $X_{i_0i_1\cdots i_n}$, any such
$U$ is in $(T_0, T_1, \ldots, T_n)^*$ provided that it is an involution. In other words
we get the general member of $(T_0, T_1, \ldots, T_n)^*$ by considering an arbitrary
involution on each of the subspaces $X_{i_0i_1\cdots i_n}$ of $X$ and taking the unique linear
transformation coinciding with these where they are defined. Since, in
particular, we may select the involution 1 in all but one of the $X_{i_0i_1\cdots i_n}$ and $-1$ in the
one remaining, we see by Lemma 4.4 that any member of $(T_0, T_1, \ldots, T_n)^{**}$
must take each $X_{i_0i_1\cdots i_n}$ into itself. Finally, since given any involution defined
in an $X_{i_0i_1\cdots i_n}$, there exists an involution in $(T_0, T_1, \ldots, T_n)^*$ coinciding with
the given one where it is defined, it follows from Lemma 4.5 that any member
of $(T_0, T_1, \ldots, T_n)^{**}$ must be constant on each $X_{i_0i_1\cdots i_n}$. Conversely, if we
assign 1's and $-1$'s in an arbitrary manner to the $X_{i_0i_1\cdots i_n}$, it is clear that there
exists a unique linear transformation of $X$ into itself which in each $X_{i_0i_1\cdots i_n}$ is
1 or $-1$ according to the above assignment and that this transformation is a
member of $(T_0, T_1, \ldots, T_n)^{**}$. Thus the number of members of $(T_0, T_1, \ldots,
T_n)^{**}$ is $2^k$ where $k$ is the number of non-zero $X_{i_0i_1\cdots i_n}$. This justifies the
statement made in the definition of $f(T_0, n)$ and tells us that $f(T_0, n) \le 2^{2^n} \cdot 2^{2^n}$.
On the other hand if $T_0$ is $m$ dimensional where $m$ is a positive integer, then
since half of the subspaces $X_{i_0i_1\cdots i_n}$ are subspaces of an $m$ dimensional
space we have $f(T_0, n) \le 2^m \cdot 2^{2^n}$. In other words if we make the convention that
$2^m = \aleph_0$ whenever $T_0$ is infinity dimensional then we have in any case $f(T_0, n) \le
2^{2^n} \min(2^m, 2^{2^n})$. We shall now show that the equality holds. Suppose first
that $T_0$ is infinity dimensional or that it is $m$ dimensional where $m$ is an integer
and $m \ge 2^n$. Then in $M_0^0$ we may select $2^n$ linearly independent elements
$x_1, x_2, \ldots, x_{2^n}$. By Lemma 4.2, there exists an involution defined on $M_0^0$
having $x_2 + x_3 + \cdots + x_{2^n}$ for one of its subspaces. Let $M_1$ be the other
subspace of this involution. Similarly define $N_1, y_2, y_3, \ldots, y_{2^n}$ in $M_0^1$. Let
$M_i = x_i\,+$ and let $N_i = y_i\,+$ ($i = 2, 3, \ldots, 2^n$). Then $X = M_1 + M_2 + \cdots +
M_{2^n} + N_1 + N_2 + \cdots + N_{2^n}$ where all of the subspaces concerned are closed
and at least one dimensional and each has nothing but 0 in common with the
+ union of the rest. Put the subspaces $M_1, M_2, \ldots, M_{2^n}$ into one-to-one
correspondence with the $2^n$ $(n + 1)$-uples of 0's and 1's $(0, i_1, i_2, \ldots, i_n)$ and the
subspaces $N_1, N_2, \ldots, N_{2^n}$ with those of the form $(1, i_1, i_2, \ldots, i_n)$. Now
given any $j = 1, 2, \ldots, n$, consider the unique linear transformation $T_j$ which
is 1 in each subspace associated with an $(i_0, i_1, \ldots, i_n)$ for which $i_j = 0$ and is
$-1$ on each subspace for which $i_j = 1$. Since $M_1 + M_2 + \cdots + M_{2^n} = M_0^0$
and $N_1 + N_2 + \cdots + N_{2^n} = M_0^1$ and all of these subspaces except possibly
$M_1$ and $N_1$ are one dimensional, it follows from Lemmas 2.1, 4.1, and 4.3 that
$T_j$ is continuous. Furthermore it is clear that $T_j^2 = 1$ and that $T_iT_j = T_jT_i$
($i, j = 0, 1, 2, \ldots, n$). Finally we see that each $X_{i_0i_1\cdots i_n}$ for this choice of
$T_1, T_2, \ldots, T_n$ is the $M_i$ or $N_i$ associated with $(i_0, i_1, \ldots, i_n)$ and hence is at
least one dimensional. In other words $(T_0, T_1, \ldots, T_n)^{**}$ contains
$2^{2^n} \cdot 2^{2^n} = 2^{2^n} \cdot \min(2^{2^n}, 2^m)$ members. Suppose now that $T_0$ is $m$
dimensional where $m$ is an integer less than $2^n$. We may suppose without loss of
generality that the $m$ dimensional subspace of $T_0$ is the one on which $T_0$ is 1.
Let $x_1, x_2, \ldots, x_m$ be the elements of a basis for $M_0^0$. Let $M_i = x_i\,+$ ($i = 1, 2,
\ldots, m$). Let $M_i$ ($i = m + 1, m + 2, \ldots, 2^n$) denote the 0 dimensional
subspace. Now we proceed exactly as before and define involutions $T_1, T_2, \ldots, T_n$
such that $T_iT_j = T_jT_i$ ($i, j = 1, 2, \ldots, n$) but such that exactly $2^n + m$ of
the $X_{i_0i_1\cdots i_n}$ are at least one dimensional. We are assured of being able to get
our full quota of non-trivial $N$'s by our hypothesis that $X$ is not finite
dimensional. Thus in this case also the $T_j$ may be chosen so that $(T_0, T_1, \ldots, T_n)^{**}$
contains $2^{2^n} \cdot \min(2^{2^n}, 2^m)$ members. Hence $f(T_0, n) = 2^{2^n} \cdot \min(2^{2^n}, 2^m)$.
Now consider the behavior of $f(T_0, n)/2^{2^n}$ as $n$ increases. If $T_0$ is infinity
dimensional, then $f(T_0, n)/2^{2^n} = \min(2^{2^n}, 2^m) = 2^{2^n}$ which increases
without limit. If $T_0$ is $m$ dimensional where $m$ is an integer, then $f(T_0, n)/2^{2^n}$
increases with $n$ until $n \ge \log_2 m$ and thereafter we have $f(T_0, n)/2^{2^n} = 2^m$.
Therefore $f(T_0, n)/2^{2^n}$ is bounded and has $2^m$ for its greatest value. Thus
$m = \log_2(\sup_n (f(T_0, n)/2^{2^n}))$ and the lemma is proved.
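
The counting formula can be tabulated to see how the dimension is recovered from the ratios. The following sketch takes the closed form $f(T_0, n) = 2^{2^n} \cdot \min(2^{2^n}, 2^m)$ from the proof as given; the function names and the use of `None` for the infinity dimensional case are illustrative assumptions.

```python
# Sketch: recover m from the ratios f(T0, n) / 2^(2^n), using the closed form in the proof.
# Assumption (illustrative): m = None stands for an infinity dimensional involution.

def f(m, n):
    """Closed form 2^(2^n) * min(2^(2^n), 2^m), with 2^m treated as unbounded when m is None."""
    full = 2 ** (2 ** n)
    cap = full if m is None else min(full, 2 ** m)
    return full * cap

def ratio(m, n):
    """The quantity f(T0, n) / 2^(2^n); always an exact integer here."""
    return f(m, n) // 2 ** (2 ** n)

def dimension_from_ratios(m, n_max):
    """log_2 of the largest ratio seen for n = 1, ..., n_max."""
    best = max(ratio(m, n) for n in range(1, n_max + 1))
    return best.bit_length() - 1  # exact log_2 of a power of two

print([ratio(5, n) for n in range(1, 5)])     # grows, then stays at 2^5
print(dimension_from_ratios(5, 4))            # recovers m = 5
print([ratio(None, n) for n in range(1, 5)])  # keeps growing without bound
```

For a finite $m$ the ratio stabilizes once $2^n \ge m$, so `n_max` only needs to be large enough for that to have happened.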

LEMMA 4.7. If $X$ is a normed linear space, $X$ fails to be finite dimensional if
and only if for each positive integer $n$ there exist $n$ distinct involutions $T_1, T_2,
\ldots, T_n$ such that $T_iT_j = T_jT_i$ ($i, j = 1, 2, \ldots, n$). If $X$ is finite dimensional
and $k$ is the largest positive integer for which there exist $k$ distinct mutually
permutable involutions, then $m = \log_2 k$, where $m$ is the dimension of $X$.

The proof of this lemma is so like certain parts of the proof of Lemma 4.6 
that we omit it. 
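
One half of the finite dimensional statement can be illustrated by listing diagonal sign patterns. The sketch below assumes, for illustration only, that $X = R^m$ and represents each diagonal involution by its tuple of diagonal entries; it exhibits $2^m$ mutually permutable involutions but does not show that no larger family exists.

```python
# Sketch: the 2^m diagonal involutions on R^m commute with one another.
# Assumption (illustrative): an involution is stored as its tuple of diagonal entries +1 / -1.
from itertools import product

def diagonal_involutions(m):
    """All diagonal matrices on R^m with entries +1 or -1, as tuples."""
    return list(product((1, -1), repeat=m))

def compose(a, b):
    """Product of two diagonal matrices, computed entry by entry."""
    return tuple(x * y for x, y in zip(a, b))

m = 3
family = diagonal_involutions(m)
identity = (1,) * m

print(len(family))                                      # 2^m members
print(all(compose(a, a) == identity for a in family))   # each squares to 1
print(all(compose(a, b) == compose(b, a)
          for a in family for b in family))             # all pairs commute
print(len(family).bit_length() - 1)                     # log_2 of the count gives m back
```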

LEMMA 4.8. Let $X$ be a non finite dimensional normed linear space. Let $T_1$
and $T_2$ be one dimensional involutions which do not commute ($T_1T_2 \ne T_2T_1$).
Let $M_1$ and $M_2$ be their respective infinite dimensional subspaces and let $\phi_1$ and $\phi_2$
be basis elements for their one dimensional subspaces. Then if $T$ is an involution
on $X$, $T$ is contained in $(T_1, T_2)^{**}$ if and only if one subspace of $T$ contains $M =
M_1 \cap M_2$ and the other is contained in $N = \phi_1 + \phi_2$.

We begin with a proof of the sufficiency of the condition. Let $U$ be an
arbitrary member of $(T_1, T_2)^*$. Let $M_+$ and $M_-$ be the subspaces of $U$. By Lemma
4.4, $T_1$ takes $M_+$ into $M_+$ and $M_-$ into $M_-$ and so does $T_2$. Hence each of
these is constant in one of $M_+$ and $M_-$ and is one dimensional in the other.
Suppose that $T_1$ is constant in one of $M_+$ and $M_-$ and that $T_2$ is constant in
the other. Then $T_1T_2 = T_2T_1$ in each of $M_+$ and $M_-$ and hence $T_1T_2 = T_2T_1$
contrary to hypothesis. Hence $\phi_1$ and $\phi_2$ are both contained in the same
subspace which we may suppose to be $M_+$. We may suppose without loss of
generality that $\phi_1\,+$ and $\phi_2\,+$ are the subspaces of $T_1$ and $T_2$ respectively in
which these transformations are 1. Hence $N \subset M_+$ and $M_- \subset M$. Let $T$
be an arbitrary involution whose 1 space is in $N$ and whose $-1$ space contains
$M$. Then since $T(x) + x$ is always in the 1 space of $T$ and $N$ is in $M_+$ we have
$U(T(x) + x) = T(x) + x$ for all $x$ in $X$ and this may be written in the form
$U(T(x)) = T(x) - U(x) + x$. On the other hand, since $U(x) - x$ is always in
$M_-$ and $M_-$ is in $M$, we have $T(U(x) - x) = x - U(x)$ for all $x$ in $X$ and this
takes the form $T(U(x)) = T(x) - U(x) + x$. Comparing these expressions
we conclude that $UT = TU$ and hence that $T$ is a member of $(T_1, T_2)^{**}$. To
prove the necessity, let $T$ be an arbitrary member of $(T_1, T_2)^{**}$. Given any
member $y$ of the complement of $N$, let $N' = N + y$. Since $X$ is infinity
dimensional, so is $M$ and hence there exists $m$ in $M$ and not in $N'$. Let $f$ be the unique
linear functional defined on $N' + m$ such that $f(\phi_1) = f(\phi_2) = 0$ and $f(y) =
f(m) = 1$. Since any maximal subspace of a finite dimensional normed linear
space is closed, any linear functional defined on one is continuous. Hence by
the Hahn-Banach extension theorem [1, p. 55, Théorème 2], there exists a
continuous linear functional $F$ defined on $X$ such that $F(\phi_1) = F(\phi_2) = 0$ and $F(y) =
F(m) = 1$. Consider the continuous linear transformation $U(x) = x - 2F(x)m$.
$U(U(x)) = x - 2F(x)m - 2F(x)(m - 2m) = x$. Therefore $U$ is an involution.
Furthermore for $i = 1, 2$, $U(T_i(x)) = T_i(x) - 2F(T_i(x))m$ and $T_i(U(x)) =
T_i(x) - 2F(x)T_i(m) = T_i(x) + 2F(x)m$. But since $T_i(x) + x$ is in the 1 space
of $T_i$ and so in $N$, $F(T_i(x) + x) = 0$ and $-2F(T_i(x)) = 2F(x)$ for all $x$ in $X$.
It follows that $UT_i = T_iU$ so that $U$ is in $(T_1, T_2)^*$. Hence $UT = TU$. In
other words, $T(x) - 2F(T(x))m = T(x) - 2F(x)T(m)$ or $F(T(x))m = F(x)T(m)$
for all $x$ in $X$. Thus $T(m) = \lambda m$. Since $m$ may be any element of $M$ not in
the (at most one dimensional) intersection of $N$ and $M$, it follows by an
argument similar to that used in Lemma 4.5 that $T$ is constant in $M$, and hence
that $M$ is contained in one of the subspaces of $T$. There is no loss in generality
in supposing that it is the $-1$ space. Thus $\lambda = -1$ and our next to the last
equation becomes $F(T(x) + x) = 0$ for all $x$ in $X$. In other words, the 1 space
of $T$ is contained in a subspace containing $N$ and not $y$. But $y$ was an arbitrary
element of $X - N$. Therefore the 1 space of $T$ is contained in $N$ and the lemma
is proved.
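
The auxiliary map $U(x) = x - 2F(x)m$ used in the necessity argument is an involution whenever $F(m) = 1$. The sketch below checks this in a toy setting; the choice of $R^3$, of the functional `F` and of the vector `m` are assumptions made only for illustration.

```python
# Sketch: U(x) = x - 2 F(x) m is an involution when F(m) = 1.
# Assumptions (not from the paper): X = R^3, F(x) = x0 + 2 x1, m = (1, 0, 0).

def F(x):
    """A linear functional with F(m) = 1."""
    return x[0] + 2 * x[1]

m = (1, 0, 0)

def U(x):
    """The map U(x) = x - 2 F(x) m."""
    c = 2 * F(x)
    return tuple(xi - c * mi for xi, mi in zip(x, m))

x = (3, 5, 7)
in_null_space = (2, -1, 4)  # F vanishes here

print(U(U(x)) == x)                        # U is an involution
print(U(m) == (-1, 0, 0))                  # U is -1 on the line through m
print(U(in_null_space) == in_null_space)   # U is 1 on the null-space of F
```

The $-1$ space of `U` is the line through `m` and its 1 space is the null-space of `F`, so in the terminology of the text `U` is a one dimensional involution.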

DEFINITION. If $T_1$ and $T_2$ are one dimensional involutions, we say that $(T_1, T_2)$
is a minimal pair if it is impossible to select distinct one dimensional involutions
$T_3$ and $T_4$ in $(T_1, T_2)^{**}$ so that $(T_3, T_4)^{**}$ is a proper subset of $(T_1, T_2)^{**}$ and
$(T_3, T_4)^{**}$ contains an infinite number of members if $(T_1, T_2)^{**}$ does.

LEMMA 4.9. If $T_1$ and $T_2$ are one dimensional involutions defined on a non
finite dimensional normed linear space $X$, then $T_1$ and $T_2$ have a common subspace
if and only if $(T_1, T_2)$ is a minimal pair.







Let $M_1$, $M_2$, $M$, $\phi_1$, $\phi_2$, and $N$ be defined as in Lemma 4.8. If $M_1 \ne M_2$
and $\phi_1\,+ \ne \phi_2\,+$, let $N_3$ and $N_4$ be two distinct one dimensional subspaces of the
two dimensional subspace $N$ neither of which is $M_1 \cap N$. Then $M_1 + N_3 =
M_1 + N_4 = X$. By Lemma 4.1, there exist involutions $T_3$ and $T_4$ such that $M_1$
is a subspace of both and $N_3$ and $N_4$ are respectively their other subspaces.
Since $N_3 \ne N_4$ and $M_3 = M_4$, $T_3$ and $T_4$ do not commute. Therefore it follows
from Lemma 4.8, provided that $T_1$ and $T_2$ do not commute, that $T_3$ and $T_4$
are in $(T_1, T_2)^{**}$. Again by Lemma 4.8, since $M_1$ and $M_2$ are both maximal
and $M_1 \ne M_2$, $T_2$ is not in $(T_3, T_4)^{**}$. Finally since there are an infinite
number of ways of choosing a one dimensional subspace of $N$ different from
$M_1 \cap N$, $(T_3, T_4)^{**}$ contains an infinite number of members. Therefore if $T_1$
and $T_2$ do not commute, $(T_1, T_2)$ is not a minimal pair. If $T_1T_2 = T_2T_1$,
then it follows from the argument used in Lemma 4.6 that $(T_1, T_2)^{**}$ contains
exactly eight members. If we let $T_3 = T_1$ and $T_4 = -T_1$, the same sort of
argument tells us that $(T_3, T_4)^{**}$ contains only four members. Therefore in
any case $(T_1, T_2)$ is not a minimal pair. Suppose now that $M_1 = M_2 = M$
and $\phi_1\,+ \ne \phi_2\,+$. By Lemma 4.8, if $T_3$ and $T_4$ are one dimensional members
of $(T_1, T_2)^{**}$ then $M_3 = M_4 = M$ and $\phi_3$ and $\phi_4$ are in $N$. Hence if $\phi_3$ and $\phi_4$
are linearly independent so that $T_3T_4 \ne T_4T_3$ and $\phi_3 + \phi_4 = N$, then again by
Lemma 4.8, $(T_3, T_4)^{**} = (T_1, T_2)^{**}$. If $\phi_3\,+ = \phi_4\,+$ then $T_3 = -T_4$ and it
follows from Lemma 4.6 that $(T_3, T_4)^{**}$ contains only a finite number of
members. But by means of the argument used in the first part of the proof it can
be shown that $(T_1, T_2)^{**}$ contains an infinity of members. Hence $(T_1, T_2)$ is a
minimal pair. If $M_1 \ne M_2$ and $\phi_1\,+ = \phi_2\,+$ the argument is similar. It depends
upon the fact that if $M_3 \ne M_4$ and $M_3 \cap M_4 \supset M$ then $M_3 \cap M_4 = M$ and
the fact that there are two linearly independent continuous linear functionals
vanishing on $M$. Finally, if $M_1 = M_2$ and $\phi_1\,+ = \phi_2\,+$, then $T_1 = \pm T_2$ and
hence $(T_1, T_2)^{**}$ contains only $1$, $-1$, $T_1$, and $-T_1$. Therefore $T_1$ and $-T_1$
are the only one dimensional involutions present. Thus $(T_1, T_2)$ is a minimal
pair and the proof of the lemma is complete.

LEMMA 4.10. Let $X$ be a non finite dimensional normed linear space. Let $m$
be a positive integer. Then if $T$ is a one dimensional involution on $X$ and $T_1$ is an
$m$ dimensional one, the one dimensional subspace of $T$ is contained in the $m$
(infinity) dimensional subspace of $T_1$ if and only if there exists an involution $T'$
having the same one dimensional subspace as $T$ and such that $T'T_1 = T_1T'$ and
is an $m - 1$ ($m + 1$) dimensional involution.

Let $M_1$ and $M_\infty$ denote the one dimensional and infinity dimensional subspaces
of $T$, and let $M_m$ and $M_\infty'$ denote the $m$ dimensional and infinity dimensional
subspaces of $T_1$. We may
suppose without loss of generality that $M_\infty$ and $M_\infty'$ are the $-1$ subspaces. If
$M_1 \subset M_m$, let $x_1, x_2, \ldots, x_m$ be a basis for $M_m$ such that $x_1$ is in $M_1$. By
Lemma 4.1, there exists an involution $T'$ whose 1 subspace is $M_1$ and whose $-1$
subspace is $M_\infty' + x_2 + \cdots + x_m$. Both $T_1$ and $T'$ are $-1$ in $M_\infty'$ and both
are 1 in $M_1$. In $x_2 + \cdots + x_m$ one is $-1$ and the other is 1. Thus $T_1T' =
T'T_1$ and is 1 on $M_1 + M_\infty'$ and $-1$ on $x_2 + \cdots + x_m$; that is, is an $m - 1$
dimensional involution. If $M_1 \subset M_\infty'$ the argument is similar. However,
instead of a basis, we use the existence of an involution on $M_\infty'$ having $M_1$ for a
subspace. Conversely, suppose that $T'$ has $M_1$ for its one dimensional
subspace and that $T'T_1 = T_1T'$. Then, by Lemma 4.4, $T'$ takes $M_m$ into $M_m$
and $M_\infty'$ into $M_\infty'$. Hence it is constant on one of $M_m$ and $M_\infty'$ and a one
dimensional involution on the other. In other words, $M_1$ is contained in either $M_m$
or $M_\infty'$ and $T'T_1$ is obviously accordingly either an $m - 1$ or $m + 1$ dimensional
involution.

LEMMA 4.11. Let $X$ be a non finite dimensional normed linear space. Let $m$
be a positive integer. Then if $T$ is a one dimensional involution on $X$ and $T_1$
is an $m$ dimensional one, the infinity dimensional subspace of $T$ contains the $m$
(infinity) dimensional subspace of $T_1$ if and only if there exists an involution $T'$
having the same infinity dimensional subspace as $T$ and such that $T'T_1 = T_1T'$
and is an $m + 1$ ($m - 1$) dimensional involution.

The proof is analogous to that of Lemma 4.10. 

We turn now to the proof of the theorem. If $X_1$ and $X_2$ are isomorphic, the
isomorphism of $G_1$ and $G_2$ is obvious. Suppose that $X_1$ and $X_2$ are
pseudo-reflexive and mutually pseudo-conjugate. Then as $X_2$ is isomorphic to $\bar X_1$
under a norm for which the elements of $X_1$ define the continuous linear
functionals on $\bar X_1$, it will be sufficient to prove that $G_1$ is isomorphic to $G_3$ where $G_3$
is the group of automorphisms of $\bar X_1$ under this norm. In the rest of this
discussion wherever the topology of $\bar X_1$ occurs it will be understood to be the one
under which it is isomorphic to $X_2$. Given any $T$ in $G_1$, let $T' = (T^{-1})^*$ where
$T^*$ is Banach's conjugate [1, p. 100]. Then since, as is well known, $(T_1T_2)^* =
T_2^*T_1^*$ and $(T_1T_2)^{-1} = T_2^{-1}T_1^{-1}$ it follows that $(T_1T_2)' = T_1'T_2'$. If $T' = 1$ for
some $T$, then $f(T^{-1}(x)) = f(x)$ or $f(T^{-1}(x) - x) = 0$ for all $f$ in $\bar X_1$ and all $x$ in $X_1$.
Hence [1, p. 55, Théorème 3] $T^{-1}(x) - x = 0$ whence $T = 1$. Next any $T'$
is continuous and hence, since $(T')^{-1} = (T^{-1})'$, has a continuous inverse and is
in $G_3$. In fact if $\{f_n\}$ is a bounded sequence of elements of $\bar X_1$, then
$\{f_n(T^{-1}(x))\}$ is a bounded sequence of real numbers for all $x$ in $X_1$. In other
words if $g_n = T'(f_n)$, $n = 1, 2, \ldots$ then $\{g_n(x)\}$ is a bounded sequence for all $x$
in $X_1$ and hence by [1, p. 80, Théorème 6], $\{T'(f_n)\}$ is a bounded sequence of
elements of $\bar X_1$. Hence $T'$ is continuous [1, p. 54, Théorème 1]. Now let $U$
be any member of $G_3$. Then if we identify each member of $X_1$ with the
corresponding functional on $\bar X_1$ and repeat the argument we have just given we find
that $(U^{-1})^*$ is a member of $G_1$ and as is easily verified $(((U^{-1})^*)^{-1})^* = U$. Thus
the set of elements of the form $T'$ where $T$ is in $G_1$ is precisely $G_3$ and $T \to T'$
is an isomorphism.

Suppose, conversely, that $G_1$ and $G_2$ are isomorphic as abstract groups. If
either $X_1$ or $X_2$ is finite dimensional, the isomorphism of $X_1$ and $X_2$ is an easy
consequence of Lemma 4.7 and the argument used in the corresponding part
of the lattice theorem. We suppose then that neither $X_1$ nor $X_2$ is finite
dimensional. Let $T_1$ and $T_2$ be two one dimensional involutions in $G_1$ having the same
one dimensional subspace $N$ and distinct infinity dimensional ones $M_1$ and $M_2$.
Let $T_1'$ and $T_2'$ respectively be their correspondents in $G_2$. It follows from
Lemma 4.6 that $T_1'$ and $T_2'$ are also one dimensional and from Lemma 4.9 that
they have a common subspace.

Case A. $T_1'$ and $T_2'$ have the same one dimensional subspace $N'$.

Since $T_1$ and $T_2$ have different infinity dimensional subspaces and hence do
not commute, it is clear that $T_1'$ and $T_2'$ have different infinity dimensional
subspaces. Let $T_3$ be any other involution having $N$ for one subspace. Since $T_1'$
and $T_2'$ have different infinity dimensional subspaces, $T_3'$ and one of $T_1'$ and $T_2'$
have different infinity dimensional subspaces and hence have the same one
dimensional subspace. In other words, $T_3'$ has $N'$ as a subspace. By the same
argument if $T'$ is any involution in $G_2$ having $N'$ as a subspace, then $T$ has $N$
as a subspace. Now let $N_1$ be any other one dimensional subspace of $X_1$. The
unique linear functional $f$ defined on $N_1 + N$ such that $f(\phi_1) = f(\phi) = 1$ where
$\phi_1$ and $\phi$ are non-zero members of $N_1$ and $N$ respectively is continuous and
hence [1, p. 55, Théorème 2] has a continuous linear extension. The null space
$M_3$ of the extension is closed and maximal and contains neither $N$ nor $N_1$.
By Lemma 4.1, there exists an involution $T_3$ having $M_3$ and $N$ for its subspaces
and an involution $T_4$ whose subspaces are $M_3$ and $N_1$. $T_3'$ has $N'$ for one
subspace and has a subspace in common with $T_4'$. Since $T_4$ does not have $N$ for a
subspace, $T_4'$ cannot have $N'$. Hence $T_3'$ and $T_4'$ have the same infinity
dimensional subspace. Let $T_5$ be any involution having $N_1$ for a subspace. $T_5'$ and $T_4'$
have a subspace in common. If they have the same infinity dimensional
subspace, then $T_5'$ and $T_3'$ and hence $T_5$ and $T_3$ have a subspace in common. Since
$N_1 \ne N$, it is their infinity dimensional subspace. Hence $T_5 = \pm T_4$. Hence
in any case $T_4'$ and $T_5'$ have a common one dimensional subspace. In other
words it has been shown that if $T$ and $U$ are one dimensional involutions in $G_1$,
then $T$ and $U$ have the same one dimensional subspace if and only if $T'$ and $U'$
do. From this point on the argument is so much like that used in the ring
theorem that we omit it except to say that Lemmas 3.1, 3.2, 3.3, 3.4 and 3.5
are replaced by Lemmas 4.2, 4.1, 4.10, 4.6 and 4.1 respectively and that we
conclude that $X_1$ and $X_2$ are isomorphic.

Case B. $T_1'$ and $T_2'$ have the same infinity dimensional subspace.

It follows at once from the argument used in Case A that no pair of
non-commuting one dimensional involutions in $G_1$ with a common one dimensional
subspace can have correspondents in $G_2$ with the same one dimensional subspace.
Hence whenever $T$ and $U$ have a common one dimensional subspace, $T'$ and $U'$
have a common infinity dimensional subspace, and if we associate with each
one dimensional subspace $N$ of $X_1$ the common infinity dimensional subspace
of the correspondents in $G_2$ of all members of $G_1$ having $N$ as a subspace, we
readily see that we get a one-to-one correspondence between the one dimensional
subspaces of $X_1$ and the closed maximal subspaces of $X_2$. Hence if we
associate with each closed maximal subspace of $X_2$ the one dimensional subspace
of continuous linear functionals having it for their null-space we will have
a one-to-one correspondence between the one dimensional subspaces of $X_1$ and
$\bar{X}_2$ respectively. In order to show that this correspondence preserves linear










dependence, we note first that if $M$ is the intersection of a finite number of
closed maximal subspaces of a normed linear space $X$ then there exists a finite
dimensional subspace $M'$ such that $M + M' = X$ and $M \cap M' = 0$, that while $M'$
is not uniquely determined its dimension is and is equal to the dimension of the
space of linear functionals vanishing on $M$, and that furthermore this dimension
does not exceed the number of maximal subspaces involved. It follows at
once that the continuous linear functionals $f_1, f_2, \ldots, f_n$ are linearly
dependent if and only if the intersection of their null-spaces contains the infinity
dimensional subspace of an $n - 1$ dimensional involution. Let $N_1, N_2, \ldots, N_n$
be a set of one dimensional subspaces of $X_1$. Let $M_1, M_2, \ldots, M_n$
respectively be their corresponding closed maximal subspaces of $X_2$. The $N_i$
($i = 1, 2, \ldots, n$) are linearly dependent if and only if there exists an $n - 1$
dimensional involution in $G_1$ whose finite dimensional subspace contains all of
the $N_i$. Using Lemmas 4.6, 4.10 and 4.11 and involutions having the $N_i$ as
subspaces we conclude from this that the $N_i$ are linearly dependent if and only
if there exists an $n - 1$ dimensional involution in $G_2$ whose infinity dimensional
subspace is contained in the intersection of the $M_i$; that is, if and only if the
continuous linear functionals defining the $M_i$ are linearly dependent. Let $V(f)$
be the one-to-one linear transformation from all of $\bar{X}_2$ into all of $X_1$ whose
existence we can now conclude from Lemma A. Let $\|x\|$ denote the norm of
an element $x$ of $X_1$ and set $\|f\|_1 = \|V(f)\|$ for each element $f$ in $\bar{X}_2$. Obviously
this defines a norm in $\bar{X}_2$ under which $\bar{X}_2$ is isomorphic to $X_1$. Let $F$ be a
linear functional on $\bar{X}_2$ which is continuous with respect to the norm $\|\ \|_1$.
Let $L$ be the null-space of $F$ and let $M = V(L)$. Then $M$ is a closed maximal
subspace of $X_1$. Let $T$ be an involution in $G_1$ having $M$ as a subspace. Let
$N'$ be the one dimensional subspace of the correspondent of $T$ in $G_2$. Using
Lemmas 4.6, 4.10 and 4.11 we see that a one dimensional subspace of $X_1$ is
contained in $M$ if and only if the corresponding closed maximal subspace of $X_2$
contains $N'$ and hence if and only if the corresponding one dimensional subspace
of $\bar{X}_2$ is contained in the common null-space of the elements of $N'$,
regarded as linear functionals on $\bar{X}_2$. Thus $L$ is the null-space of an element
of $N'$. It follows that $F(f) = f(x)$ for some $x$ in $N'$ (and hence in $X_2$) and all $f$
in $\bar{X}_2$. Conversely, given any $x$ in $X_2$, let $T'$ be a member of $G_2$ having the
one dimensional subspace containing $x$ as a subspace, let $T$ be the corresponding
member of $G_1$ and let $M$ be the infinity dimensional subspace of $T$. Finally let
$L = V^{-1}(M)$. Just as before we can show that $L$ is the null-space of $x$ and hence,
since $L$ is closed, that $F(f) = f(x)$ is continuous. Thus $X_2$ is pseudo-reflexive and
$X_1$ is isomorphic to its pseudo-conjugate. In other words $X_1$ and $X_2$ are
pseudo-reflexive and mutually pseudo-conjugate and the theorem is proved.


Concluding Remarks 


The three theorems proved in this paper may be generalized to prove that 
the system consisting of a linear space X and a total subspace L of the space of 
all linear functionals defined on X is characterized to within isomorphism by its 













lattice of L-closed subspaces, by the ring of L-continuous linear transformations 
of X into itself and, if the system L, X be identified with the system X, L, 
by its group of automorphisms. $X_1, L_1$ and $X_2, L_2$ are said to be isomorphic
if there exist one-to-one linear transformations of all of $X_1$ into all of $X_2$ such
that whenever $x$ in $X_1$ corresponds to $x'$ in $X_2$ and $f$ in $L_1$ corresponds to $f'$ in
$L_2$ then $f(x) = f'(x')$. An $L$ continuous linear transformation is a linear
transformation $T$ such that $F(x) = f(T(x))$ for all $x$ in $X$ is in $L$ whenever $f$ is. An
$L$ closed subspace is an intersection of null spaces of linear functionals in $L$. If
we let L be the set of continuous functionals for a norm, we get as special cases 
the theorems of this paper. In the author’s thesis, now in progress, the systems 
X, L are investigated systematically. 

Due principally to the fact that distinct convex topologies may have the same 
set of continuous linear functionals, the isomorphism theorems cannot be ex- 
tended, as they stand, to arbitrary convex linear topological spaces. The 
author expects to discuss the situation in detail in a later paper. 

Eidelheit [2] shows that, at least in the case of complete spaces, any
isomorphism between the rings $R_1$ and $R_2$ can be represented in the form
$T' = UTU^{-1}$ where $T$ is in $R_1$, $T'$ is in $R_2$ and $U$ is a one-to-one bicontinuous
linear transformation from all of $X_1$ into all of $X_2$. The author expects to
discuss the extent to which this is true in the group situation at a later date.

Other questions suggesting themselves for investigation include the following: 
(1) What properties must an abstract group (ring, lattice) have in order to be 
the group (ring, lattice) associated with some normed linear space or more
generally some system $X, L$ in accordance with the above theorems? (2) What
special properties do the groups (rings, lattices) of complete, reflexive and other 
special kinds of normed linear space have? (3) Can we give simple characteriza- 
tions of various well known normed linear spaces in terms of properties of the 
algebraic systems associated with them? (4) Do any of the three theorems 
imply any of the others without the intervention of Lemmas A and B? The 
derivation of the ring theorem from the group theorem looks as if it might be 
fairly easy. 


HARVARD UNIVERSITY 


BIBLIOGRAPHY 


1. BANACH, S., Théorie des Opérations Linéaires, 1932.
2. EIDELHEIT, M., On isomorphisms of rings of linear operators, Studia Mathematica, vol. 9
(1940), pp. 97-105.
3. VEBLEN, O. and YOUNG, J., Projective Geometry.
4. FICHTENHOLZ, G., Sur les fonctionnelles linéaires continues au sens généralisé,
Matematicheskii Sbornik, N.S., vol. 4 (1938), pp. 193-213.
5. HAUSDORFF, F., Zur Theorie der linearen metrischen Räume, Journ. f. reine u. angew.
Math., vol. 167 (1932), pp. 294-311.
6. SOBCZYK, A., Projections in Minkowski and Banach spaces, Duke Mathematical Journal,
vol. 8 (1941), pp. 78-106.










ANNALS OF MATHEMATICS
Vol. 43, No. 2, April, 1942 


NEUAUFBAU DER ENDENTHEORIE 


By HANS FREUDENTHAL
(Received September 2, 1941) 


Compactifications of topological spaces are already known from other areas of
topology; for example, the need for a compact substrate for geometric
investigations is surely one of the reasons for completing the Euclidean plane
to the projective plane or to the function-theoretic plane. This example shows
that a compactification can turn out very differently. Topologically one will
regard the function-theoretic compactification as the more natural one;
topologically there is not the slightest reason why certain divergent sequences
should converge to one point at infinity and certain others to another. For
the infinite straight line, on the other hand, one will topologically prefer the
compactification by two points at infinity ("endpoints") to that by one, since
there is no reason at all to lump together the point sequences that diverge in
different directions. Likewise one will compactify the infinite cylinder by two
"endpoints", the three-times punctured sphere (the bathing trunks) by three
endpoints, and so on.

Of a "natural" compactification one will therefore demand:

1. The part at infinity, the set of endpoints, should be as thin as possible
(more precisely: zero-dimensional).

2. The part at infinity should be split up as far as possible (without thereby
violating certain general space axioms).

In my dissertation¹ I was the first to treat this compactification problem (for
the purpose of investigations in topological group theory) and solved it (by
introducing the "endpoints"²) for the topological spaces $R$ that satisfy the
following conditions:

a) second axiom of countability,

b) local compactness (compactness in the small),

c) local connectedness (connectedness in the small),

d) connectedness.
In footnote 15 of my dissertation I had already announced that condition c
can be dropped.

Mr. L. Zippin,³ on the other hand, has weakened precisely condition b and
replaced it by

b') semicompactness,⁴

i.e. the existence of arbitrarily small neighborhoods (of every point) with compact
boundary. b' is indeed the "true" condition if one wishes the part at infinity to
be zero-dimensional, so that in particular every finite point is contained in
arbitrarily small neighborhoods that have no points "at infinity" on their boundary.

¹ Berlin 1931: Über die Enden topologischer Räume und Gruppen, Math. Zeitschrift 33
(1931), 692-713.

² "Endpoints" were introduced incidentally, in very special cases, by B. v. Kerékjártó
(Vorlesungen über Topologie, Berlin 1923, p. 164) and by L. Zippin (Transactions Amer.
Math. Soc. 31 (1929), 744-770, especially 763).

³ On semicompact spaces (Amer. Journ. of Math. 57 (1935), 327-341). L. Zippin's
investigations were in part independent of those cited in footnote 1.

⁴ In fact Zippin additionally requires metrizability, which is however superfluous
(see 4.2 and 5.4).

In the present paper I want to fulfil the promise made in my dissertation
and build up the theory of endpoints without condition c. In doing so quite
peculiar difficulties will arise, which I shall overcome in 6.4-5; that is
where the centre of gravity of the paper lies.

It now turns out that d, too, can be dropped almost entirely; one replaces
d by

d') compactness of the component space of $R$,
i.e.: every decreasing sequence of nonempty open-and-closed subsets
of $R$ shall have a nonempty intersection.

Condition d' is satisfied by all connected spaces and all spaces
with only finitely many components. On the other hand d' excludes that $R$
consists of infinitely many isolated components; d' does not, however,
necessarily exclude that $R$ contains infinitely many components that accumulate
nowhere. For example, the following space (a subset of the Cartesian
plane) is admissible, which is composed of the components


$C_n$: $x = 1/n$ ($n$ natural),

$D_n$: $x = 0$, $n < y < n + 1$ ($n$ an integer);


for here the $D_n$ accumulate nowhere, and yet d' holds, since every open-and-closed
set that contains a $D_n$ contains almost all $C_n$, hence also all $D_n$.

Condition d' is in fact unavoidable if one wants to compactify a space
by endpoints. If one takes as $R$ a space consisting of infinitely many
isolated points, one notices at once that the splitting of the infinite
must here be continued transfinitely, and that as closure one obtains at best
a pathological structure (one not satisfying the first axiom of countability).

I also adopt Zippin's weakening b'. We have already seen
that here too no further weakening is possible if the set of endpoints
is to be zero-dimensional.

My hypotheses thus read:

a) second axiom of countability,
b') semicompactness,
d') compactness of the component space.

One can even drop the second axiom of countability if one replaces
semicompactness by semibicompactness and aims not at a compact but at a
bicompact result. That is not very important; for adjoining the endpoints
makes the space bicompact even when d' does not hold at all. The main
difficulty of the proof (see 6.4-5) consists precisely in showing that from a
space with the second axiom of countability, when d' holds, there again arises
a space with the second axiom of countability (which, being a bicompact space,
must then be compact). Nevertheless I have, as far as possible, taken the
somewhat more general bicompact point of view.

By having been able to restrict myself to the indispensable requirements a, b', d',
I have, it seems to me, brought the theory of ends to a certain
conclusion. At the outset my method does not differ appreciably
from that of my dissertation (the real difference comes only in 6.4-5).
The "endpoints" are of course again defined by descending sequences of open
sets with compact boundary.⁵ One must of course subject these sequences
to certain fineness restrictions. In my dissertation and
in Zippin's work this is done on the basis of requirement c. For local
connectedness has the effect that one can restrict oneself to domains for the
generating open sets, and that on further descent these domains split
essentially into only finitely many subdomains. Thus local connectedness
yields on the one hand the atomistic character of the sequences used,
on the other hand the compactness of the result.

If c is not at one's disposal, one can enforce the atomistic character
by maximality requirements. It seemed to me more expedient
and more intuitive, however, to require of the generating sequences
$G_1 \supset G_2 \supset \cdots$ that for any two open sets $O$, $P$ with compact
boundary and with $\overline{O} \subset P$ the following holds:


if $O \cap G_n \neq \emptyset$ for all $n$, then $G_n \subset P$ for almost all $n$.


In my dissertation I also treated a characterization problem.
If $R^*$ is a compactum and $R$ is everywhere dense in $R^*$, one may
ask whether perhaps $R^*$ is precisely the compactification of $R$ by the
endpoints. The necessary and sufficient conditions of my dissertation
read:

α) $R^* \setminus R$ is closed and zero-dimensional,

β) a domain neighborhood of a point of $R^* \setminus R$ is not disconnected
by $R^* \setminus R$.

Zippin, loc. cit., weakened α to

α') $R^* \setminus R$ is a totally disconnected $F_\sigma$.

In the present paper the characterization conditions read:

α') $R^* \setminus R$ is zero-dimensional,

β') the neighborhoods of a point $p$ of $R^* \setminus R$ are not split by $R^* \setminus R$
into two sets open in $R$ that both have $p$ as an accumulation point.
The present paper was prompted by beautiful investigations of
Mr. J. de Groot, who has concerned himself with the following problem: let $A$
and $A'$ be homeomorphic, $A \cap B = A' \cap B' = \emptyset$. When can every
homeomorphism between $A$ and $A'$ be extended to a homeomorphism
between $A \cup B$ and $A' \cup B'$?

⁵ Of course one must restrict oneself to open sets with compact boundary
if one does not want to arrive at pathological results. This has been overlooked
on various occasions.

If one requires, as Mr. de Groot in fact does, that $A \cup B$ and $A' \cup B'$
be compacta, and subjects the sets to further conditions which
characterize $B$ resp. $B'$ precisely as the endpoint set of $A$ resp. $A'$, then one
does indeed force the required extendability of every homeomorphism.

Mr. de Groot arrived directly from his formulation of the problem, without knowledge
of my investigations and Zippin's, at results which overlapped with
Zippin's. In his note⁶ he requires (besides everywhere-density):

α') $B$ and $B'$ are zero-dimensional,

β'') $B$ resp. $B'$ do not locally disconnect $A \cup B$ resp. $A' \cup B'$. In further,
as yet unpublished investigations he weakens β'' further. His new
conditions are precisely identical with the conditions α', β' named earlier.

I remark, however, that priority for this theorem (and therewith also for
the characterization theorem named earlier) belongs in full to
Mr. de Groot, and that it was only these very general results of Mr. de Groot
that prompted me to take up again the investigations of my dissertation.
At first, to be sure, it seemed to me that the greater generality of
Mr. de Groot's results was due to the different nature of the question;
but then it proved possible to build de Groot's extension theorem, as a
characterization theorem, into a generalized theory of ends.

In conclusion (in 8) we apply the notions to the group-theoretic
question that formed the starting point of my dissertation. For in my
dissertation I showed that a group space satisfying a-d can have at most
two endpoints. Zippin extended this theorem to group spaces
satisfying a, b', c, d; in doing so he had to modify my considerations
considerably. By a renewed, very far-reaching modification and deepening of the
method I can now prove that a group space which

1) satisfies the second axiom of countability,

2) is semicompact and

3) is connected
has at most two endpoints. Since $R^*$ is compact and $R^* \setminus R$ consists of
at most two points, $R$ is necessarily locally compact. From this
follows the remarkable theorem:

A group space satisfying 1-3 is automatically locally compact.


⁶ Proc. Akad. Amsterdam 44 (1941), ————— = Indagationes Math. 8 (1941), ————


Notation


$\emptyset$ = empty set.
$A \cup B$ = union of $A$ and $B$.
$A \cap B$ = intersection of $A$ and $B$.
$\bigcup$ = symbol for forming the union over a set of sets.
$\bigcap$ = symbol for forming the intersection over a set of sets.
$a \in A$: $a$ is an element of $A$.
$a \notin A$: $a$ is not an element of $A$.
$A \setminus B$: totality of the $a \in A$, $a \notin B$.
$\overline{A}$ = closed hull of $A$.
$\mathfrak{R}(M)$ = boundary of $M$.

We frequently use Gothic letters for sets whose elements are sets.
1. General topological notions


1.1. In what follows let $R$ be a topological space, i.e. in $R$ a system of
open sets is given to which belong $\emptyset$, $R$, with any two sets their intersection,
and with arbitrarily many sets their union. The complements of the open sets
are called closed. A neighborhood of a set is any open set containing it.
The intersection of all closed sets containing $M$ is called
$\overline{M}$, the closed hull of $M$. A point belongs to the boundary of $M$, $\mathfrak{R}(M)$,
if and only if each of its neighborhoods contains points of $M$ and points
of $R \setminus M$.

1.2. We repeatedly use the following simple consequences: if two
open sets are disjoint, then each is disjoint from the closed hull of the
other. If $O$ is open, then $\mathfrak{R}(O) = \overline{O} \setminus O$.

1.3. In what follows we always assume the separation property T: any
two points can be separated by disjoint neighborhoods. In particular
every closed set is then the intersection of its neighborhoods, and
every one-point set is closed.

1.4. A basis of $R$ is a system from which all open sets can be generated
by forming unions. The second axiom of countability reads:
$R$ possesses a countable basis. Such a space is also called separable.
A system $\mathfrak{V}$ of neighborhoods of $M$ is called a neighborhood basis of $M$ if
for every neighborhood $U$ of $M$ there exists a $V \in \mathfrak{V}$ with $V \subset U$.

1.5. $R$ is called regular if for every point $a$ and every neighborhood $U$ of
$a$ there exists a neighborhood $V$ of $a$ with $\overline{V} \subset U$. $R$ is called normal if
the corresponding statement holds for every closed set. Regular separable spaces
are known to be normal.

1.6. A system $\mathfrak{a}$ of sets of $R$ is called finitely bound (endlich gebunden) if the
intersection of any finitely many sets of $\mathfrak{a}$ is nonempty. $\mathfrak{a}$ is called bound
(gebunden) if the whole of $\mathfrak{a}$ has a nonempty intersection.

$R$ is called bicompact if every finitely bound system of closed
sets of $R$ is also bound. Bicompactness is known to be equivalent
to the covering property: every covering of $R$ by open
sets contains a finite covering.

A separable bicompact space is called a compactum.







1.7. $R$ is called semibicompact, respectively a semicompactum, if there exists a basis
$\mathfrak{K}$ of $R$ such that every set of $\mathfrak{K}$ has a bicompact set, respectively a
compactum, as its boundary.

1.8. A set that is open and closed at the same time is called a chunk (Brocken).
The complement of a chunk is again a chunk. A space $R$ is called
connected if it contains no chunks other than $\emptyset$ and $R$. A component
of $R$ is any maximal connected subset. A space
is called zero-dimensional if its chunks form a basis.
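
The notions of 1.8 can be tried out on a finite toy example. The following is only a sketch: the function names and the three-point space are hypothetical, and a finite space of this kind does not satisfy the separation property T of 1.3 unless it is discrete, so it merely illustrates the definitions of "chunk" (open-and-closed set) and "connected".

```python
def clopen_sets(points, open_sets):
    """Return all sets that are both open and closed (the 'Brocken' of 1.8)."""
    points = frozenset(points)
    opens = {frozenset(o) for o in open_sets}
    # A set is closed exactly when its complement is open.
    return {o for o in opens if points - o in opens}


def is_connected(points, open_sets):
    """A space is connected if its only chunks are the empty set and the space."""
    return clopen_sets(points, open_sets) == {frozenset(), frozenset(points)}


# Hypothetical example: a topology on {0, 1, 2} in which {2} is split off.
X = {0, 1, 2}
T = [set(), {0}, {0, 1}, {2}, {0, 2}, {0, 1, 2}]

if __name__ == "__main__":
    print(sorted(sorted(c) for c in clopen_sets(X, T)))
    print(is_connected(X, T))
```

In this example the chunks are the empty set, {0, 1}, {2} and the whole space, so the space is not connected; its chunks do not form a basis either (the open set {0} is not a union of chunks), so it is not zero-dimensional in the sense of 1.8.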

1.9. A compactum can have only countably many chunks. For the
second axiom of countability yields a countable basis for the chunks; by
the covering theorem, however, every chunk can already be generated from finitely many
of these basis chunks.

1.10. $M$ is called everywhere dense in $R$ if every nonempty open set of $R$ contains
points of $M$.


2. The generating systems


2.1. We consider topological spaces $R$ in which a (for the moment undefined)
relation between open sets is given, which we denote by $\Subset$.
The relation $\Subset$ satisfies the conditions:

1. If $O \Subset P$, then $\overline{O} \subset P$.

2. If $O_1 \Subset P$, $O_2 \Subset P$, then $O_1 \cup O_2 \Subset P$.

3. If $O \Subset P_1$, $O \Subset P_2$, then $O \Subset P_1 \cap P_2$.

4. If $O \Subset P$, then $R \setminus \overline{P} \Subset R \setminus \overline{O}$.

5. If $O_1 \subset O_2$, $O_2 \Subset P_2$, $P_2 \subset P_1$, then $O_1 \Subset P_1$.
A system of relations $\Subset$ that satisfies these 5 conditions is called a
$\mathfrak{D}$-system (also $\mathfrak{D}(R)$). If 2.1.5 does not necessarily hold, it is called a
$\mathfrak{D}'$-system (also $\mathfrak{D}'(R)$).
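
The five conditions of 2.1 can be checked mechanically on a finite toy space. The sketch below is an illustration under stated assumptions, not part of the paper: the names `closure`, `closure_relation` and `check_axioms` and the three-point space are hypothetical, condition 2.1.1 is read with the closed hull of $O$, and condition 2.1.4 with the complements of the closed hulls. A finite space of this kind is not a T-space unless it is discrete; the conditions of 2.1 themselves do not involve T.

```python
from itertools import product


def closure(points, opens, subset):
    """Smallest closed set containing `subset` (closed = complement of an open set)."""
    result = points
    for o in opens:
        closed = points - o
        if subset <= closed:
            result = result & closed
    return result


def closure_relation(points, opens):
    """The relation 'O is related to P' iff closure(O) is contained in P."""
    return {(o, p) for o, p in product(opens, repeat=2)
            if closure(points, opens, o) <= p}


def check_axioms(points, opens, rel):
    """Return the sorted list of conditions 2.1.1-2.1.5 (as integers) violated by `rel`."""
    failed = set()
    for o, p in rel:
        cl_o = closure(points, opens, o)
        cl_p = closure(points, opens, p)
        if not cl_o <= p:                                   # 2.1.1
            failed.add(1)
        if (points - cl_p, points - cl_o) not in rel:       # 2.1.4
            failed.add(4)
        for o1, p1 in product(opens, repeat=2):             # 2.1.5
            if o1 <= o and p <= p1 and (o1, p1) not in rel:
                failed.add(5)
    for (o1, p1), (o2, p2) in product(rel, repeat=2):
        if p1 == p2 and (o1 | o2, p1) not in rel:           # 2.1.2
            failed.add(2)
        if o1 == o2 and (o1, p1 & p2) not in rel:           # 2.1.3
            failed.add(3)
    return sorted(failed)


# Hypothetical example space on three points.
X = frozenset({0, 1, 2})
T = [frozenset(s) for s in ([], [0], [0, 1], [2], [0, 2], [0, 1, 2])]

if __name__ == "__main__":
    print(check_axioms(X, T, closure_relation(X, T)))
    print(check_axioms(X, T, {(frozenset({0, 1}), frozenset({0, 1}))}))
```

The first call checks the relation "closed hull of $O$ contained in $P$"; the second checks a single isolated relation, which in this example fails conditions 4 and 5 because the relations they would force are missing.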

From 2.1.5 it follows that every nonempty $\mathfrak{D}$-system contains the relations
$\emptyset \Subset O \Subset R$.

A space equipped with a $\mathfrak{D}$-system we call a $\mathfrak{D}$-space.
Any space $R$ can be made into a $\mathfrak{D}$-space by prescribing $O \Subset P$
if and only if $\overline{O} \subset P$ (one easily sees that all requirements
are satisfied). One is not obliged to do so, however; one may also take a proper
subset of this as the $\mathfrak{D}$-system of $R$.

A $\mathfrak{D}$- resp. $\mathfrak{D}'$-system of $R$ generates in every subspace $S$ again a $\mathfrak{D}$- resp.
$\mathfrak{D}'$-system. Namely, one sets $O \cap S \Subset P \cap S$ in $S$ if and only if
$O \Subset P$ holds in $R$. When passing to subspaces we always take the
induced system as a basis.

One can also interpret $\Subset$ intuitively as a qualitative notion of distance;
namely, one reads $O \Subset P$ as: $O$ and $R \setminus P$ have a distance from each other.

2.2. Let a $\mathfrak{D}'$-system be given in $R$. Construct the following system $\mathfrak{D}$:
the relation $O \Subset P$ is in $\mathfrak{D}$ if and only if there is in $\mathfrak{D}'$ a relation
$O_1 \Subset P_1$ with $O \subset O_1 \Subset P_1 \subset P$. As one easily sees, the new system
is indeed a $\mathfrak{D}$-system. $\mathfrak{D}'$ is then called a basis of $\mathfrak{D}$, and $\mathfrak{D}$ is said to be
generated by $\mathfrak{D}'$.
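
The passage from a basis to the system it generates, as described in 2.2, can be sketched on a finite example. The function name `generate`, the three-point space and the choice of basis (one relation "$O$ related to $O$" for each open-and-closed set $O$) are assumptions made only for illustration; the code checks just the monotonicity condition 2.1.5, which holds for the generated relation by construction.

```python
from itertools import product


def generate(opens, basis):
    """All pairs (O, P) of open sets with O inside O1 and P1 inside P for some basis pair (O1, P1)."""
    opens = [frozenset(o) for o in opens]
    basis = [(frozenset(o1), frozenset(p1)) for o1, p1 in basis]
    return {(o, p) for o, p in product(opens, repeat=2)
            if any(o <= o1 and p1 <= p for o1, p1 in basis)}


def is_monotone(opens, rel):
    """Condition 2.1.5: shrinking O or enlarging P keeps the relation."""
    opens = [frozenset(o) for o in opens]
    return all((o1, p1) in rel
               for (o, p) in rel
               for o1, p1 in product(opens, repeat=2)
               if o1 <= o and p <= p1)


# Hypothetical example: three-point space, basis built from its open-and-closed sets.
T = [set(), {0}, {0, 1}, {2}, {0, 2}, {0, 1, 2}]
chunks = [set(), {0, 1}, {2}, {0, 1, 2}]
basis = [(b, b) for b in chunks]
D = generate(T, basis)

if __name__ == "__main__":
    print((frozenset({0}), frozenset({0, 1})) in D)
    print((frozenset({0}), frozenset({0})) in D)
    print(is_monotone(T, D))
```

Here {0} is related to {0, 1} because the open-and-closed set {0, 1} fits between them, while {0} is not related to itself, since no open-and-closed set lies between {0} and {0}.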







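2.3. A set $\mathfrak{g}$ of open sets of the $\mathfrak{D}$-space $R$ is called a generator (Erzeugende)
if:
1. $\mathfrak{g}$ is finitely bound.
2. For every $G \in \mathfrak{g}$ there exists a $G' \in \mathfrak{g}$ with $G' \Subset G$.
3. For any two open sets $O$, $P$ with $O \Subset P$ and with $O \cap G \neq \emptyset$ for all
$G \in \mathfrak{g}$ there exists a $G_0 \in \mathfrak{g}$ with $G_0 \subset P$.

2.4. If $\mathfrak{g}$ is a generator and $G_1 \in \mathfrak{g}$, $G_2 \in \mathfrak{g}$, then there exists $G_3 \in \mathfrak{g}$ with
$G_3 \subset G_1 \cap G_2$.
Proof: By 2.3.2 there exist $G_1', G_2' \in \mathfrak{g}$ with $G_\nu' \Subset G_\nu$. We set $O = G_1' \cap G_2'$,
$P = G_1 \cap G_2$. By 2.1.3 and 2.1.5, $O \Subset P$, and by 2.3.1, $O \cap G \neq \emptyset$
for all $G \in \mathfrak{g}$. By 2.3.3 there thus exists a $G_3 \in \mathfrak{g}$ with $G_3 \subset P = G_1 \cap G_2$,
q.e.d.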

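2.5. We define:

$\mathfrak{g} < O$: $G \subset O$ for a certain $G \in \mathfrak{g}$.
$\mathfrak{h} < \mathfrak{g}$: $\mathfrak{h} < G$ for all $G \in \mathfrak{g}$.
$\mathfrak{g}\mathfrak{h} \neq \emptyset$: $G \cap H \neq \emptyset$ for all $G \in \mathfrak{g}$ and $H \in \mathfrak{h}$.

2.6. From $\mathfrak{g} < \mathfrak{h}$ follows $\mathfrak{g}\mathfrak{h} \neq \emptyset$. From $\mathfrak{g}\mathfrak{h} \neq \emptyset$ follows $\mathfrak{g} < \mathfrak{h}$ (hence also $\mathfrak{h} < \mathfrak{g}$).

Proof: The first half is clear. Now let $\mathfrak{g}\mathfrak{h} \neq \emptyset$. Let $H \in \mathfrak{h}$. By 2.3.2 we
determine an $H' \in \mathfrak{h}$ with $H' \Subset H$. By hypothesis $G \cap H' \neq \emptyset$ for
all $G \in \mathfrak{g}$. Hence by 2.3.3 there exists a $G \in \mathfrak{g}$ with $G \subset H$. Thus $\mathfrak{g} < H$. This
holds for all $H \in \mathfrak{h}$. Thus $\mathfrak{g} < \mathfrak{h}$.

2.7. If one writes $\mathfrak{g} = \mathfrak{h}$ instead of $\mathfrak{g}\mathfrak{h} \neq \emptyset$ (or $\mathfrak{g} < \mathfrak{h}$), one sees from 2.6
that the usual rules of calculation for the equality sign hold.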


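3. The space $R^*$


3.1. We intend to interpret the generators of $R$ (taking into account the
equality just defined) as points of a new space $R^*$. First
we define:

$O^*$ = set of all $\mathfrak{g} < O$. In particular $R^*$ = set of all generators of $R$.

$O^* \Subset P^*$ if and only if $O \Subset P$.

3.2.1. From $O \subset P$ follows $O^* \subset P^*$.

2. $(O \cap P)^* = O^* \cap P^*$.
3. $(O \cup P)^* \supset O^* \cup P^*$.

Proof: 3.2.1 is clear. 3.2.2: Let $\mathfrak{g} \in (O \cap P)^*$. Then $\mathfrak{g} < O \cap P$, hence
$\mathfrak{g} < O$, $\mathfrak{g} < P$. Hence $\mathfrak{g} \in O^*$, $\mathfrak{g} \in P^*$, hence $\mathfrak{g} \in O^* \cap P^*$. Conversely let
$\mathfrak{g} \in O^* \cap P^*$. Then $\mathfrak{g} \in O^*$, $\mathfrak{g} \in P^*$. Hence $\mathfrak{g} < O$, $\mathfrak{g} < P$. Hence there exist
$G_1 \in \mathfrak{g}$, $G_1 \subset O$; $G_2 \in \mathfrak{g}$, $G_2 \subset P$; hence by 2.4 also a $G_3 \in \mathfrak{g}$ with
$G_3 \subset G_1 \cap G_2$. But then $G_3 \subset O \cap P$, hence $\mathfrak{g} < O \cap P$, hence
$\mathfrak{g} \in (O \cap P)^*$, q.e.d. Now 3.2.3: Let $\mathfrak{g} \in O^* \cup P^*$. Then $\mathfrak{g} < O$ or $\mathfrak{g} < P$,
hence $\mathfrak{g} < O \cup P$, hence $\mathfrak{g} \in (O \cup P)^*$.

3.3. $R^*$ becomes a topological space by the stipulation: the $O^*$
form a basis of $R^*$. (Follows from 3.2.2.) The boundary operation in $R^*$ is denoted $\mathfrak{R}^*$.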

3.4.1. The $G^*$ with $G \in \mathfrak{g}$ form a neighborhood basis of $\mathfrak{g}$ (see 1.4). (Clear.)
2. Every $\mathfrak{g}$ can be generated using exclusively sets $G$ that occur
in the relations of the basis $\mathfrak{D}'$ of $\mathfrak{D}$. (One merely replaces any two
sets $G, H \in \mathfrak{g}$ with $G \Subset H$ by $G_1$ and $H_1$, where $G_1 \Subset H_1$ belongs to $\mathfrak{D}'$ and
$G \subset G_1 \subset H_1 \subset H$ holds.) 3. The $O^*$ that occur in the relations of $\mathfrak{D}'$
form a basis of $R^*$. (Follows from 3.4.1-2.)

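3.5. $\mathfrak{g} \in \overline{O^*}$ if and only if $O \cap G \neq \emptyset$ for all $G \in \mathfrak{g}$.

Proof: By 1.2 and 3.4.1, $\mathfrak{g} \in \overline{O^*}$ is equivalent to: $G^* \cap O^* \neq \emptyset$ for all
$G \in \mathfrak{g}$. But by 3.4.2 this is equivalent to: $G \cap O \neq \emptyset$ for all $G \in \mathfrak{g}$, q.e.d.

3.6. From $O \Subset P$ follows $\overline{O^*} \subset P^*$.

Proof: Let $\mathfrak{g} \in \overline{O^*}$. Then by 3.5: $G \cap O \neq \emptyset$ for all $G \in \mathfrak{g}$. Hence by
2.3.3: there exists a $G_0 \in \mathfrak{g}$, $G_0 \subset P$. Hence $\mathfrak{g} < P$, $\mathfrak{g} \in P^*$, q.e.d.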

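3.7. $(R \setminus \overline{O})^* = R^* \setminus \overline{O^*}$.

Proof: $\mathfrak{g} \in R^* \setminus \overline{O^*}$ is equivalent to $\mathfrak{g} \notin \overline{O^*}$, or by 3.5 to the
existence of a $G \in \mathfrak{g}$ with $G \cap O = \emptyset$, and this is equivalent to the
existence of a $G \in \mathfrak{g}$ with $G \subset R \setminus \overline{O}$,⁷ and this is equivalent to
$\mathfrak{g} \in (R \setminus \overline{O})^*$, q.e.d.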


3.8. $R^*$ is regular (hence also satisfies T).

Proof: Let $O^*$ be a neighborhood of $\mathfrak{g}$. By 3.4 and 3.2.1, $G \subset O$ for a
certain $G \in \mathfrak{g}$. By 2.3.2 there exists $G' \in \mathfrak{g}$ with $G' \Subset G$. By 2.1.1, $\overline{G'} \subset G$,
and by 3.6: $\overline{G'^*} \subset G^*$. Hence also $\overline{G'^*} \subset O^*$, and $G'^*$ is the neighborhood of $\mathfrak{g}$
that guarantees regularity.


4. The bicompactness of $R^*$


4.1. $R$ is called $\mathfrak{D}$-regular if for every point $a$ and every neighborhood $V$
of $a$ there exists a neighborhood $U$ of $a$ with $U \Subset V$.

$R$ is called $\mathfrak{D}$-normal if for every relation $O \Subset P$ there exists a relation
$O \Subset Q \Subset P$.

$R$ is called $\mathfrak{D}$-separable if $\mathfrak{D}(R)$ possesses a countable basis $\mathfrak{D}'(R)$.

4.2. A $\mathfrak{D}$-regular $R$ is also regular. A $\mathfrak{D}$-regular and $\mathfrak{D}$-separable $R$
is also separable and normal.

Proof: The first part follows from 2.1.1. Second part: Let $\mathfrak{D}'(R)$ be the countable
basis of $\mathfrak{D}(R)$. Let $\mathfrak{P}$ be the (countable) set of all $P$ that occur in relations
$O \Subset P$ of $\mathfrak{D}'$. Let $Q$ be any open set. By $\mathfrak{D}$-regularity
there exists for every $a \in Q$ a neighborhood $U$ of $a$, $U \Subset Q$. By the
notion of basis there exists in $\mathfrak{D}'$ a relation $O \Subset P$ with $U \subset O \Subset P \subset Q$.
Hence there certainly exists a $P \in \mathfrak{P}$ with $a \in P \subset Q$. The union of all such $P$
(over all $a \in Q$) yields $Q$. Hence every open set $Q$ can be represented as a union
of sets of $\mathfrak{P}$. Since $\mathfrak{P}$ was countable, $R$ is separable. From
separability and regularity follows normality.

4.3. Let $\mathfrak{g}(a)$ be the set of neighborhoods of $a$.

4.4. Let $R$ be $\mathfrak{D}$-regular. Then $\mathfrak{g}$ is a topological mapping of $R$ into $R^*$.
Here $O$ is the inverse image of $O^*$.

Proof: 1. $\mathfrak{g}(a)$ is a generator: 2.3.1-2 are evident on account of the
$\mathfrak{D}$-regularity. Let $O$ and $P$ satisfy the hypotheses of 2.3.3; thus $O \Subset P$
and $O \cap G \neq \emptyset$ for all $G \in \mathfrak{g}(a)$. If $a \notin P$, then by 2.1.1 also $a \notin \overline{O}$, hence
$a \in R \setminus \overline{O}$, hence $R \setminus \overline{O} \in \mathfrak{g}(a)$, but $(R \setminus \overline{O}) \cap O = \emptyset$, contradicting the
hypothesis on $O$. Hence necessarily $a \in P$, hence $P \in \mathfrak{g}(a)$, and $P$ can serve as the
$G_0$ required in 2.3.3.

2. Let $a \neq b$. Then by T there exist $U \in \mathfrak{g}(a)$, $V \in \mathfrak{g}(b)$, $U \cap V = \emptyset$. Hence
$\mathfrak{g}(a) \neq \mathfrak{g}(b)$. Hence $\mathfrak{g}$ is one-to-one.

3. We determine the inverse image of $O^*$, i.e. the set of all $a$ with $\mathfrak{g}(a) \in O^*$,
or $\mathfrak{g}(a) < O$. But by definition these are precisely the $a$ of $O$. Hence $O$ is
the inverse image of $O^*$. This also yields the continuity of $\mathfrak{g}$. The inverse
of $\mathfrak{g}$ is continuous as well, for $\mathfrak{g}(O) = O^* \cap \mathfrak{g}(R)$ is open in $\mathfrak{g}(R)$.

4.5. We can and shall also regard $R$ as a subset of $R^*$. We then
also have $O^* \cap R = O$.

4.6. In a $\mathfrak{D}$-regular $R$ the following holds (supplement to 3.2.1): From $O^* \subset P^*$ follows
$O \subset P$. (Follows from 4.5.)

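4.7. Let $R$ be $\mathfrak{D}$-regular. $R^*$ becomes a $\mathfrak{D}$-space by the stipulation: the
relations $O^* \Subset P^*$ (with $O \Subset P$) form a basis $\mathfrak{D}'(R^*)$ of $R^*$.

Proof: It suffices to show that the conditions 2.1.1-4 are fulfilled
for $\mathfrak{D}'(R^*)$:

(2.1.1:) Let $O^* \Subset P^*$. By definition then $O \Subset P$, hence by 2.1.1
(for $\mathfrak{D}(R)$) $\overline{O} \subset P$, hence by 3.6: $\overline{O^*} \subset P^*$, q.e.d.

(2.1.2:) Let $O_1^* \Subset P^*$, $O_2^* \Subset P^*$. By definition $O_1 \Subset P$, $O_2 \Subset P$,
hence by 2.1.2 (for $\mathfrak{D}(R)$) $O_1 \cup O_2 \Subset P$. By 3.2.3, $O_1^* \cup O_2^* \subset (O_1 \cup O_2)^*$,
hence also $\Subset P^*$, q.e.d.

(2.1.3:) Analogously, using 3.2.2.

(2.1.4:) Let $O^* \Subset P^*$, hence $O \Subset P$. By 2.1.4 (for $\mathfrak{D}(R)$), $R \setminus \overline{P} \Subset R \setminus \overline{O}$,
hence $(R \setminus \overline{P})^* \Subset (R \setminus \overline{O})^*$ and by 3.7: $R^* \setminus \overline{P^*} \Subset R^* \setminus \overline{O^*}$, q.e.d.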

4.8. Let $R$ be $\mathfrak{D}$-regular. Then $R$ can be regarded as a subspace of $R^*$
also with respect to the $\mathfrak{D}$-system. (Clear.)

4.9. Theorem I: Let $R$ be $\mathfrak{D}$-regular. Then $R^*$ is also $\mathfrak{D}$-regular. If $R$ is
moreover $\mathfrak{D}$-separable, then $R^*$ is also separable and $\mathfrak{D}$-separable. $R^*$ is an
extension of $R$, and $R$ is everywhere dense in $R^*$.

(Follows from 3.4.3, 3.8, 4.2 and 4.8.)

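4.10. Let $R$ be $\mathfrak{D}$-regular and $\mathfrak{D}$-normal. Let $A$ be closed in $R^*$, $\mathfrak{g} \notin A$. Then
there is a sequence $U_\kappa^*$ of neighborhoods of $A$ with $U_1 \Supset U_2 \Supset \cdots$, such that
$\mathfrak{g} \notin U_1^*$.

Proof: We choose $V \in \mathfrak{g}$ with $\overline{V^*} \cap A = \emptyset$ and $W_1 \in \mathfrak{g}$ with $W_1 \Subset V$. On the
basis of the $\mathfrak{D}$-normality we determine inductively a $W_{\kappa+1}$ with $W_\kappa \Subset W_{\kappa+1} \Subset V$.
We set $U_\kappa = R \setminus \overline{W_\kappa}$. Then 2.1.4 yields: $U_1 \Supset U_2 \Supset \cdots \supset R \setminus \overline{V}$. Further
3.2.1 yields: $U_\kappa^* \supset (R \setminus \overline{V})^* \supset A$. Hence the $U_\kappa^*$ are indeed neighborhoods
of $A$. Since $\mathfrak{g} < V$, $\mathfrak{g} \notin U_1^*$. Hence the $U_\kappa$ fulfil all wishes.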

4.11. Sarz Il: R sei D-reguldér und D-normal. Dann ist R* bikompakt. Ist 
R obendrein D-separabel, so ist R* ein Kompaktum. 

BEWEIs: 4 sei eine endlich gebundene Menge abgeschlossener Mengen von R*. 
Wir beweisen die Gebundenheit von a. 

Sei A ea. Zu jedemg ¢A bestimmen wir eine Folge U%;* gema® 4.10. u sei 
die Menge aller dieser offenen Mengen, wo A ganz a und g ganz R\A durchliuft. 
Wir zeigen, da u die Eigenschaften 2.3.1-2 eines g besitzt: 

(2.3.1:) U%"“* (x = 1, +++, k) seien endlich viel Mengen aus u. Es gibt ein 
bef, A,. Dann auch 6 ¢ (U%"**)*. Also existiert ein H eh mit H C U%e*s 
(x = 1,---,k). Da ZH nichtleer ist, ist auch M, U%"** nichtleer. Also ist u 
endlich gebunden. 

(2.3.2:) Sei US“ eu. Dann ist USZ, © US*. Also gilt 2.3.2. 

From 𝔲 we form all intersections of finitely many members; this yields 𝔳.
𝔳 too satisfies 2.3.1 (trivially) and 2.3.2 (for by 2.3 and 2.5, O_κ ⋐ P_κ
implies ∩_κ O_κ ⋐ ∩_κ P_κ).

On the other hand, 2.3.3 need not hold. Now let O, P be a pair violating
2.3.3 with respect to 𝔳. So in any case O ⋐ P, O ∩ G ≠ ∅ for all G ∈ 𝔳. By
𝔇-normality we determine a sequence P_n: P_1 = P, O ⋐ P_{n+1} ⋐ P_n,
and adjoin the P_n to 𝔳. This yields a system 𝔲'. 𝔲' satisfies 2.3.1 (since
𝔳 satisfied it) and 2.3.2 (by the choice of the P_n). The essential difference
between 𝔲' and 𝔲, however, is that the pair O, P no longer violates 2.3.3
with respect to 𝔲'.

Using a well-ordering, one can therefore extend 𝔲 to a set g satisfying
2.3.1-3. g ∈ R*. g < every U of 𝔲. Hence g ∈ every U* with U ∈ 𝔲. Hence
g ∈ ∩_{U∈𝔲} U*. If g ∉ ∩_{A∈𝔞} A, there would be an A_0 ∈ 𝔞 with g ∉ A_0
and a U ∈ 𝔲 with g ∉ U*, and this would yield a contradiction. Hence
g ∈ ∩_{A∈𝔞} A, and 𝔞 is bound, q.e.d.

The rest of the theorem now follows from Theorem I.

4.12. Theorem III: In R*, under the hypotheses of Theorem II: from Ō ⊂ P
it follows that O ⋐ P.

Proof: For each g ∈ Ō there is an H ∈ g, H* ⊂ P; further a G ∈ g, G ⋐ H.
Finitely many of the corresponding G* cover the bicompact Ō; say
G_1*, ···, G_k*. Because G_κ ⋐ H_κ and by 2.1.1-2, G = ∪G_κ ⋐ ∪H_κ = H. Hence
G ⋐ H and Ō ⊂ G* ⊂ H* ⊂ P, hence by 2.5 and 4.7: O ⋐ P, q.e.d.


5. The Endpoints


5.1. We now consider special systems 𝔇.
Let 𝔎 be the system of all open sets of R with bicompact boundary and ℨ
that of all open sets with empty boundary, i.e. the system of all clopen sets
("Brocken"). From O ∈ 𝔎 it follows that R\Ō ∈ 𝔎.

𝔇'_𝔎 is defined thus: one sets O ⋐ P if Ō ⊂ P and O ∈ 𝔎, P ∈ 𝔎.

𝔇'_ℨ is defined thus: one sets O ⋐ O for all O ∈ ℨ.

It is easy to see that 2.1.1-4 hold in both cases.

𝔇_𝔎 resp. 𝔇_ℨ are the systems generated by 𝔇'_𝔎 resp. 𝔇'_ℨ.

5.2. Let R be semibicompact. Let O ∈ 𝔎, P open, Ō ⊂ P. Then there is a
P_1 ∈ 𝔎 with Ō ⊂ P_1 ⊂ P̄_1 ⊂ P.

Proof: To each point of the boundary ℜ(O) there is a 𝔎-neighborhood U with Ū ⊂ P.
Finitely many of them, U_1, ···, U_k, cover ℜ(O). P_1 = O ∪ U_1 ∪ ··· ∪ U_k
satisfies the requirement.

5.3. Let R be semibicompact. O ⋐ P according to 𝔇_𝔎 if and only if there
exists an O_1 ∈ 𝔎 with O ⊂ O_1 ⊂ Ō_1 ⊂ P.

Proof: Only if: Since 𝔇'_𝔎 is a basis of 𝔇_𝔎, there are O_1, P_1 ∈ 𝔎 with
O ⊂ O_1 ⋐ P_1 ⊂ P, hence also O ⊂ O_1 ⊂ Ō_1 ⊂ P. If: By 5.2 there exists
P_1 ∈ 𝔎 with Ō_1 ⊂ P_1 ⊂ P. Hence O ⊂ O_1 ⋐ P_1 ⊂ P, hence O ⋐ P by 2.1.5.

5.4. A semibicompact R is 𝔇_𝔎-normal.

Proof: It suffices, for O, P ∈ 𝔎, to derive from O ⋐ P the existence of a
relation O ⋐ Q ⋐ P. But this follows from 5.2.

5.5. A semibicompact R is also 𝔇_𝔎-regular.

Proof: Let U be a neighborhood of a. There is a neighborhood V of a, V ∈ 𝔎.
(R\V) ⊂ R\(a). Hence by 5.2 there is a P_1 ∈ 𝔎 with (R\V) ⊂ P_1 ⊂ R\(a).
Then W = R\P_1 is a neighborhood of a, W ∈ 𝔎 and W ⊂ U, q.e.d.

5.6. We now reserve the notation R* for the space generated from R by means
of 𝔇_𝔎 (in the case that R is semibicompact). We also call R* the closure of
R by its endpoints. R*\R is called the set of endpoints.

It is clear that in generating R* one may impose on the g the additional
requirement g ⊂ 𝔎. We shall mostly do this tacitly.

For 𝔇_ℨ the hypotheses of Theorems I-II are no longer satisfied (no
𝔇_ℨ-regularity!). Nevertheless much is preserved. We call the space generated
from R by means of 𝔇_ℨ Z(R), the component space of R. Indeed, in simple
cases the components of R are elements of Z(R); however, the example of the
Introduction shows that this does not always hold: the components D_n
form a single point of Z(R).

For the 𝔇_ℨ-generators the requirements 2.3.1-3 are particularly easy to
satisfy; they reduce to g consisting of clopen sets and g being finitely
bound and maximal.

5.7. Let the map φ(O) assign to each set the smallest clopen set containing
it. For g ∈ R* we set φ(g) = the set of the φ(O) with O ∈ g. One shows
easily that R* is mapped continuously onto Z(R) by φ.

From this it follows: Under the hypotheses of Theorem II, Z(R) is bicompact.
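
The operator φ of 5.7 can be illustrated on a toy example. The following is only a sketch under a strong simplifying assumption that is not in the paper: the space is finite and its topology is given explicitly as a list of open sets, so the smallest clopen set containing a subset is simply the intersection of all clopen sets containing it. The function names are hypothetical.

```python
def clopen_sets(points, opens):
    """Return the clopen sets of a finite topological space.

    `opens` must list every open set, including the empty set and the
    whole space (an assumption of this sketch)."""
    full = frozenset(points)
    open_family = {frozenset(o) for o in opens}
    # A set is clopen when it and its complement are both open.
    return [o for o in open_family if (full - o) in open_family]


def smallest_clopen(points, opens, subset):
    """Intersect all clopen sets containing `subset` (the phi of 5.7)."""
    target = frozenset(subset)
    result = frozenset(points)  # the whole space is always clopen
    for c in clopen_sets(points, opens):
        if target <= c:
            result &= c
    return result


# A four-point space whose clopen sets are {}, {1,2}, {3,4} and the space.
points = {1, 2, 3, 4}
opens = [set(), {1}, {1, 2}, {3, 4}, {1, 3, 4}, {1, 2, 3, 4}]
print(sorted(smallest_clopen(points, opens, {1})))
```

In a finite space a finite intersection of clopen sets is again clopen, which is why the intersection is itself the smallest one; in a general space this step needs the hypotheses of the text.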

5.8. Let P ∈ 𝔎. Then ℜ*(P*) = ℜ(P).

Proof: By 3.5, ℜ(P) ⊂ ℜ*(P*) is evident. We therefore need only exhibit,
for a g ∈ ℜ*(P*), an a ∈ ℜ(P) with g(a) = g. g ≮ P ∪ (R\P̄), hence for
all G ∈ g:

G ⊄ P ∪ (R\P̄)

or

(1) G ∩ ℜ(P) ≠ ∅.

We show that the set of all

(2) Ḡ ∩ ℜ(P) with G ∈ g

is finitely bound: by 2.4 one has for G_κ ∈ g (κ = 1, ···, k) a G ∈ g with

G ⊂ ∩_{κ=1}^{k} G_κ,

hence

∩_{κ=1}^{k} (Ḡ_κ ∩ ℜ(P)) = (∩_{κ=1}^{k} Ḡ_κ) ∩ ℜ(P) ⊃ G ∩ ℜ(P) ≠ ∅

by (1). The set (2) is therefore finitely bound, hence, by the bicompactness
of ℜ(P), bound. Let

a ∈ ∩_{G∈g} (Ḡ ∩ ℜ(P)).

Then first a ∈ ℜ(P) and second a ∈ Ḡ for all G ∈ g, hence also a ∈ G for
all G ∈ g, hence g = g(a), q.e.d.

5.9. Let R be semibicompact. Then R*\R is zero-dimensional.

Proof: Let g ∈ R*\R and O* a neighborhood of g. There is a P ∈ 𝔎,
P ∈ g, P* ⊂ O*. Application of 5.8 shows: in R*\R the neighborhood
P* ∩ (R*\R) of g has no boundary point, and is therefore a clopen set
rel. R*\R contained in O*. Hence R*\R is zero-dimensional.

5.10. Theorem IV: Let R be semibicompact. Then R*, the closure of R
by its endpoints, is bicompact, and so is Z(R), the component space of R.
R is everywhere dense in R*. The set of endpoints R*\R is zero-dimensional.
Z(R) is zero-dimensional.

Proof: The bicompactness of R* follows from Theorem II (by 5.4-5 the
hypotheses are satisfied). That of Z(R) follows from 5.7. The dimension of
R*\R follows from 5.9; that of Z(R) results trivially.


6. The Compactness of R*


From now on we always assume that R satisfies the second axiom of
countability and is semicompact.

6.1. Theorem V: Z(R) is a compactum if and only if in R every
descending sequence of nonempty clopen sets has a nonempty intersection.

Proof: If: By separability there exists a countable basis 𝔘 ⊂ ℨ of ℨ (the
set of clopen sets). From the sets of 𝔘 form the unions of finitely many at
a time; this yields the system 𝔔. Let O be a clopen set. O = ∪_{n=1}^∞ U_n
with certain U_n ∈ 𝔘. The sets O\∪_{n=1}^k U_n form a descending sequence
of clopen sets with empty intersection. By hypothesis one of them is empty,
hence O is the union of finitely many U_n, hence O ∈ 𝔔. Hence 𝔇_ℨ has a
countable basis, consisting of all relations Q ⋐ Q with Q ∈ 𝔔. Hence R is
𝔇_ℨ-separable, hence Z(R) is separable by 3.4.3 and accordingly, by
Theorem IV, a compactum.

Only if: If there were a descending sequence M_1 ⊃ M_2 ⊃ ··· of nonempty
clopen sets with empty intersection in R, then the N_n = M_n\M_{n+1} would be
mutually disjoint clopen sets, and their union would be the clopen set M_1.
Let φ be a set of natural numbers and ψ its complementary set. Then the sets

∪_{ν∈φ} N_ν and ∪_{ν∈ψ} N_ν

are complements of each other (within M_1) and, as unions of open sets, open;
hence both are clopen sets. Hence

(∪_{ν∈φ} N_ν)*

is a clopen set in Z(R) for every φ; there are thus continuum many clopen
sets in Z(R), and by 1.9 this contradicts the separability of Z(R).

6.2. Let A be closed in R and ℜ(A) compact. Let Z(R) be a compactum.
Then Z(A) is also a compactum.

Proof: We apply 6.1: Let M_1 ⊃ M_2 ⊃ ··· be a sequence of nonempty
clopen sets rel. A. We show that their intersection is nonempty: If all
M_n ∩ ℜ(A) ≠ ∅, then their intersection is nonempty as well, by the
compactness of ℜ(A), and then we are done. So let M_k ∩ ℜ(A) = ∅. Now let
n ≥ k. M_n is open in A, hence now also open in A\ℜ(A), hence also open in R.
On the other hand M_n was closed in A and A closed in R, hence M_n is also
closed in R. The M_n (n ≥ k) are thus also clopen sets in R, and by hypothesis
and 6.1 their intersection is nonempty, q.e.d.

6.3. Theorem VI: Let R be separable and semicompact. The closure R* of R
by its endpoints is a compactum if and only if the component space Z(R)
is a compactum.

Only if follows from 5.7, since Z(R), as a continuous image of R*, must
necessarily also be a compactum.

To prove if, we shall construct a countable basis 𝔇'' for 𝔇_𝔎. By
Theorem I, R* then indeed becomes separable. Our task is thus the following:
we construct a countable system 𝔎' ⊂ 𝔎 such that for any two sets
O, P ∈ 𝔎 with O ⋐ P there exists a set Q ∈ 𝔎' with O ⋐ Q ⋐ P.
If this succeeds, we are indeed done; for 𝔇'_𝔎 was a basis of 𝔇_𝔎,
and if one now has a relation O ⋐ P from 𝔇'_𝔎, then by definition
O, P ∈ 𝔎; one can therefore (assuming the solution of the "task") construct
O_1, P_1 ∈ 𝔎' with O ⊂ O_1 ⋐ P_1 ⊂ P. Accordingly the O_1 ⋐ P_1 with
O_1, P_1 ∈ 𝔎' form a countable basis 𝔇'' of 𝔇_𝔎.

The solution of the task is carried out in 6.4-5. It is remarkable that the
compactness of the component space Z(R) is an essential condition for
solvability.

6.4.1. Let O_1, O_2, ··· form a basis of R, O_n ∈ 𝔎.

6.4.2. Let 𝔄_n be the set of all O_ν and R\O_ν with ν ≤ n.

6.4.3. Let 𝔅_n be the set of all intersections formed from sets of 𝔄_n.

6.4.4. Let ℭ_n be the set of all clopen sets of all B ∈ 𝔅_n. By 6.2 and the
hypothesis on Z(R), Z(B) is also compact, hence by 1.9 the set of clopen sets
of each B is countable, hence ℭ_n is countable.

6.4.5. Let 𝔎''_n consist of all unions of finitely many sets of ℭ_n.

6.4.6. Let 𝔎'_n consist of the open kernels (interiors) of all sets of 𝔎''_n.

6.4.7. 𝔎' is the union of all 𝔎'_n.

One sees at once that 𝔎' ⊂ 𝔎.

6.5. Let O, P ∈ 𝔎. In accordance with the "task" we show the existence of a
Q ∈ 𝔎' with O ⋐ Q ⋐ P.

6.5.1. By the compactness of ℜ(O) there are among the O_1, O_2, ··· of the
basis finitely many, O_{ν_1}, ···, O_{ν_k}, which cover ℜ(O) in such a way that
O_{ν_i} ⊂ P. We set r = max (ν_1, ···, ν_k).

6.5.2. Let 𝔅 be the totality of the B ∈ 𝔅_r with B ∩ O ≠ ∅. 𝔅 is a covering
of O.

6.5.3. We split 𝔅 into 𝔅' and 𝔅'', such that B ∈ 𝔅' if and only if
B ∩ ℜ(O) ≠ ∅.

6.5.4. If B ∈ 𝔅', then on the one hand, by (6.4.2-3), B is for each ν ≤ r a
subset of some A ∈ 𝔄_ν, hence a subset of O_{ν_i} or of R\O_{ν_i} for each i.
On the other hand B ∩ ℜ(O) ≠ ∅ and, by 6.5.1, also B ∩ O_{ν_i} ≠ ∅ for a
certain i. Together this gives: B ⊂ O_{ν_i} ⊂ P. Thus: all B ∈ 𝔅' lie in P.

6.5.5. Now let B ∈ 𝔅''. Then B ∩ ℜ(O) = ∅. Hence B ∩ O is a clopen set
in B, hence B ∩ O ∈ ℭ_r. We set

T = ∪_{B∈𝔅'} B ∪ ∪_{B∈𝔅''} (B ∩ O).

Then T ∈ 𝔎''_r (see 6.4.5); T ⊂ P (see 6.5.4); O ⊂ T, and indeed, by 6.5.1,
every boundary point of O is even an interior point of T.

6.5.6. Let Q be the open kernel of T. Then by 6.4.6: Q ∈ 𝔎'_r ⊂ 𝔎'.
Further, by 6.5.5: O ⋐ Q ⋐ P.

Thus 𝔎' indeed solves the task.


7. The Characterization Theorem

7.1. Let S be a compactum. Let R satisfy the conditions of Theorem VI, let R
be everywhere dense in S, and let S\R be zero-dimensional. Then there exists a
continuous map f(R*) = S which is the identity on R.

Proof: The closure of a set taken in S will be indicated by the exponent S.

Let g ∈ R*. Let the set of all G^S with G ∈ g be called g^S. Since g^S is
finitely bound, its intersection is nonempty; let it contain the point a ∈ S.

Let V and W be neighborhoods of a in S; V ⊂ W; ℜ(V) ⊂ R; ℜ(W) ⊂ R
(because S\R is zero-dimensional, one can find arbitrarily small neighborhoods
of this kind). For G ∈ g we have a ∈ G^S, hence G^S ∩ V ≠ ∅, hence, since V
is open, G ∩ V ≠ ∅. By 2.3.3 there exists G_0 ∈ g with G_0 ⊂ W. Then also
G_0^S ⊂ W^S. The intersection of all sets of g^S is accordingly contained in
every neighborhood W of a and therefore consists only of the point a, which
we also call f(g). By the foregoing, f maps every g < G_0 into W^S, hence
f(G_0*) ⊂ W^S (where G_0 depends on W and W can be taken arbitrarily small).
Hence the map f is continuous. f(p) = p for p ∈ R is evident.

7.2. Property β' reads: If U is a neighborhood of a ∈ S\R in S, then it is
impossible to split U ∩ R into two disjoint sets, open in R, which both have
a as an accumulation point (in S).

7.3. Suppose that besides the conditions of 7.1, β' is also satisfied. Then
the map f of 7.1 is topological.

Proof: We need only show that f is one-to-one. Let a ∈ S\R and let A be the
f-preimage of a. A is closed and zero-dimensional. Suppose A consists of
more than one point (we lead this assumption to a contradiction).

If the neighborhood V (in S) of a is small enough, f^{-1}(V) splits into two
disjoint open sets C and D. If V' is a neighborhood of a with V̄' ⊂ V, then
f^{-1}(V') splits into two disjoint open sets C' and D', whose closures are
still disjoint from each other and which both contain points of A. The
decomposition C' ∩ R, D' ∩ R of V' ∩ R contradicts β', q.e.d.

7.4. S = R* has property β'.

Proof: Let Q be a neighborhood of g in R* that violates β', and O, P a
decomposition of Q ∩ R violating β'. Let Q' be a neighborhood of g with
Q̄' ⊂ Q. Then Q' ∩ R is split into O' = Q' ∩ O and P' = Q' ∩ P in such a way
that Ō' ⊂ O, P̄' ⊂ P (closure formed in R). g is an accumulation point of
O' and P' in R*. Hence for every G ∈ g: G ∩ O' ≠ ∅, G ∩ P' ≠ ∅. By 2.3.3
there are G_1 and G_2 in g with G_1 ⊂ O, G_2 ⊂ P. From O ∩ P = ∅ follows
G_1 ∩ G_2 = ∅, contradicting 2.3.1. Hence there is no violation of β'.

7.5. Let S be a compactum. Let R be everywhere dense in S, let S\R be
zero-dimensional, and let β' hold. Then R is a semicompactum and Z(R) a
compactum.

Proof: Let a ∈ R. The union of a and S\R is zero-dimensional.’
Hence there are arbitrarily small neighborhoods of a whose boundary is
disjoint from S\R, hence compact in R.

To prove the compactness of Z(R) we use the criterion 6.1.
Let M_1 ⊃ M_2 ⊃ ··· be a sequence of distinct clopen sets of R. We choose
a_n ∈ M_n\M_{n+1}. From the a_n we extract a subsequence a_{ν_n}, convergent
in S, with limit a. We have

a_{ν_n} ∈ M_{ν_n}\M_{ν_n+1} ⊂ M_{ν_n}\M_{ν_{n+1}}.

We set a'_n = a_{ν_n}, M'_n = M_{ν_n}. Instead of ∩M_n ≠ ∅ we may equally
prove ∩M'_n ≠ ∅. If this were false, a could not be in R. By 5.8, a would then
be an interior point of M'_1^S, and U = M'_1^S would be a neighborhood of a.
Then

V = ∪ (M'_{2n}\M'_{2n+1}), W = ∪ (M'_{2n-1}\M'_{2n})

would furnish a decomposition of U ∩ R contradicting β'.

’ See e.g. K. Menger, Dimensionstheorie, Leipzig 1928, p. 115.

7.6. Theorem VII: Let S be a compactum. Let R be everywhere dense in S. Then
S is essentially the compactification of R by its endpoints if and only if:

α') S\R is zero-dimensional.

β') The neighborhoods of a point p ∈ S\R are not split by S\R into
two sets, open in R, which both have the accumulation point p in S.

Proof: If follows from 7.3, where 7.5 justifies our having omitted the
semicompactness of R and the compactness of Z(R) from the hypotheses
on R. Only if follows from Theorem VI and from 7.4.


8. The Endpoints of Groups

8.1. In what follows let R satisfy the second axiom of countability, be
semicompact, connected, and a topological group (the last requirement
means that R is a group in which a·b is a continuous function of a and b
and a^{-1} a continuous function of a); let the identity be called e.

8.2. The left (or right) multiplications by a fixed element a are
topological maps of R onto itself; by Theorem VII they can be extended
topologically to the endpoints. ag thus makes sense as lim ac_ν
when lim c_ν = g; and indeed ag is again an endpoint.

8.3. Let O ∈ 𝔎, P ∈ 𝔎, Ō ⊂ P. Since ℜ(O) is compact and lies in P, there is
a neighborhood U of e with

(1) ℜ(aO) = aℜ(O) ⊂ P for all a ∈ U.

Let

N = aO ∩ (R\P);
ℜ(N) ⊂ ℜ(aO) ∪ ℜ(R\P),

hence by (1)

(2) ℜ(N) ⊂ ℜ(P).

Let

(3) N_ν = a_νO ∩ (R\P),


(4) lim a_ν = e.


If all N_ν were nonempty, then because R is connected all
ℜ(N_ν) would also be nonempty, and there would be


(5) c_ν ∈ ℜ(N_ν) ⊂ ℜ(P) (by (2)).







Then by (3) c_ν ∈ a_νŌ, hence

(6) c_ν = a_νb_ν with b_ν ∈ Ō.


By (5) the c_ν have an accumulation point c in ℜ(P); hence by (4) c is
also an accumulation point of the b_ν and by (6) lies in Ō, and this
contradicts the hypothesis Ō ⊂ P.

Hence the assumption is false, and in every sequence a_ν with lim a_ν = e
there are infinitely many ν with empty N_ν, i.e. with a_νO ⊂ P. Hence there
is also a neighborhood V of e with

aO ⊂ P for all a ∈ V.


Let the endpoint g be defined by the sequence G_n with Ḡ_{n+1} ⊂ G_n.* Then
by the foregoing there are neighborhoods V_n of e with aG_{n+1} ⊂ G_n for all
a ∈ V_n. Hence ag < G_n for a ∈ V_n.

Let a_ν be an e-sequence. For each n there is a k_n with a_ν ∈ V_n for
ν ≥ k_n. Hence a_νg < G_n for ν ≥ k_n. From this follows


lim a_νg = (lim a_ν)g


(first for e-sequences and then generally for convergent sequences).

8.4. Right multiplication by g thus maps R continuously into R*\R. Since R
is connected but R*\R is zero-dimensional, the image set must be a single
point, and since eg = g this point is g. Hence


ag = g.


8.5. Therefore, for lim c_ν = g and G ∈ g, also ac_ν ∈ G for almost all ν.
But then for every compact M ⊂ R


also Mc_ν ⊂ G for almost all ν.


8.6. We summarize the results in

Theorem VIII: The left (or right) multiplications by elements of R are
topological maps up to and including the endpoints. The endpoints are
fixed points of these maps. Every compact set of R can be drawn into every
neighborhood of every endpoint by right (and left) multiplication.

We now prove

Theorem IX: A connected, semicompact group satisfying the second axiom of
countability has at most two endpoints, and is therefore locally
compact.

8.7. Proof: Let f, g, h be three distinct endpoints (we lead this
assumption to a contradiction).

We take F ∈ f, G ∈ g, G' ∈ g, H ∈ h, such that F, G, H are pairwise disjoint
and Ḡ' ⊂ G. By Theorem V there exists c with


ℜ(Fc) = ℜ(F)c ⊂ G'.


* In what follows we always assume that the open sets generating an endpoint
belong to 𝔎.







If we set

N = Fc ∩ (R\G'),

then (as in 8.3 (2))

ℜ(N) ⊂ ℜ(G'),

hence N is a clopen set in R\G'; f < N, h ≮ N because f < Fc, f < R\G',
h ≮ Fc (invariance of f and h under multiplication by c according to
Theorem VIII). Forming O = N ∩ (R\G), one sees: there is a clopen set O in
R\G with f < O, h ≮ O. Analogously one finds a clopen set P in R\G with
h < P, f ≮ P. One may assume O ∩ P = ∅ (if need be, remove O ∩ P from both).

Let the intersection of all clopen sets of R\G containing f resp. h be
called A_G resp. B_G. Since R is connected, Z(R) is compact, hence Z(R\G)
is also compact (see 6.2), hence by 6.1

A_G ≠ ∅, B_G ≠ ∅,

further

(1) A_G ∩ B_G = ∅.
(2) For G' ⊂ G we have A_{G'} ⊃ A_G, B_{G'} ⊃ B_G.

Let G_1 ⊃ G_2 ⊃ ··· be a sequence defining g; we set

A = ∪_{n=1}^∞ A_{G_n}, B = ∪_{n=1}^∞ B_{G_n}.

By (1) and (2)

(3) A ∩ B = ∅.

From (2) one concludes at once that A and B are independent of the choice
of the sequence G_n ∈ g, and from this follows

cA_G = A_{cG}, cB_G = B_{cG},

hence

cA = A, cB = B.

But for

c = ba^{-1} (with a ∈ A, b ∈ B)

this yields a contradiction to (3), q.e.d.

8.8. Theorem X: If a connected, semicompact group satisfying the second
axiom of countability has two endpoints, then the endpoints are inverse to
each other (i.e. if lim c_ν is one endpoint, then lim c_ν^{-1} is the other).
Every closed subgroup that is not compact then also has two
endpoints.










Proof: Let f and g be the endpoints. Let g be defined by the sequence
G_1 ⊃ G_2 ⊃ ···, f ≮ G_1. Let A_G be defined as above (8.7) as the
intersection of the clopen sets containing f. A = ∪_{n=1}^∞ A_{G_n} is again
independent of the choice of the G_n, hence cA = A for all c, hence A = R,
hence e ∈ A.

We choose k so that


(1) e ∈ A_{G_k}.

Then

(2) e ∈ R\G_k.

By Theorem V there is, for the compact ℜ(G_1), a c_n ∈ G_n with
ℜ(c_nG_1) = c_nℜ(G_1) ⊂ G_k. Then the boundary of

(3) N_n = c_nG_1 ∩ (R\G_k)


lies entirely in ℜ(G_k), hence N_n is a clopen set in R\G_k; because
f ≮ G_1 and by the invariance of the endpoints, f ≮ c_nG_1, f ≮ N_n, hence
N_n must be disjoint from A_{G_k}, hence by (1)


e ∉ N_n.

From this together with (2) and (3) follows

e ∉ c_nG_1

or

(4) c_n^{-1} ∉ G_1.


Passage to the inverse is a topological self-map of R, which by
Theorem VIII can be extended topologically to the endpoints. lim c_n^{-1}
therefore exists and is an endpoint, and by (4) it cannot be the endpoint
g. From this follows the same assertion for every sequence c_ν with
lim c_ν = g, hence the first half of the theorem.

If Q is a closed subgroup and the sequences G_n and G'_n generate the
endpoints of R, then the sequences Q ∩ G_n and Q ∩ G'_n are either both
empty (and then Q is compact) or both nonempty (and then Q, like R, has two
endpoints), and this proves the second half of the theorem.


AMSTERDAM. 




















ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 


ON THE CONTINUATION OF A RIEMANN SURFACE 


By Maurice H. Heins
(Received November 14, 1941) 


1. Introduction. 


Let F denote a Riemann surface in the sense of Weyl-Radó [15, 17]. If there
exists another Riemann surface G such that F admits a (1, 1) directly conformal
map onto a proper part, F', of G, then F is said to be continuable and G is said
to be a continuation of F; otherwise, F is said to be non-continuable or maximal.
The closed Riemann surfaces are maximal. Radó [14] has shown by example
that there exist open maximal Riemann surfaces. We remark that every open 
topological surface admits a topological continuation. Later Bochner [1] estab- 
lished by appeal to the well-ordering hypothesis and transfinite induction that 
every continuable Riemann surface admits a maximal continuation. Shortly 
thereafter, the question of characterizing the continuable Riemann surfaces was 
considered by de Possel [10, 11], first, in two notes which are fragmentary in 
character and do not contain any indication of proofs, and later, in his thesis 
[12], where he gives a characterization of continuable Riemann surfaces in terms 
of “sets of maximal type” [12, p. 4] and the topological structure of the surface. 

We shall consider the family Φ consisting of Riemann surfaces, G, which are
continuations of a given continuable Riemann surface, F, and of F itself. Let
it be assumed that F does not admit the Riemann sphere or a closed Riemann
surface of genus one as a continuation save when the contrary is mentioned. Under
these hypotheses, it will be shown that an explicitly given subset, Φ₀, of Φ may
be defined in a natural manner to be an ℒ-space in the sense of Fréchet [3] and
that so defined Φ₀ is compact. With the aid of this result, the theorem of
Bochner may be established without appeal to transfinite induction. Problems
concerning the exhibition of a maximal continuation of a given continuable Riemann
surface, and the existence of a maximal continuation with specified properties
to be stated in the course of the present paper find an appropriate setting in
the study of the structure of Φ and Φ₀. These questions have not been treated
hitherto.


2. Riemann surfaces, Fuchsian and Fuchsoid groups 


We recall that a Riemann surface in the sense of Weyl-Radó may be defined
as a 2-dimensional manifold F such that to each neighborhood U(w₀) of a point
w₀ on the manifold there are associated biuniform and bicontinuous
transformations τ_{U(w₀)} of U onto simply-connected regions of the complex plane with the
property that, if U₁ and U₂ are neighborhoods of points w₁ and w₂ of F
respectively, and if they have a non-vacuous intersection, then τ_{U₂}τ_{U₁}⁻¹ defines a (1, 1)
directly conformal transformation of τ_{U₁}(U₁·U₂) onto τ_{U₂}(U₁·U₂) [8, 15, 17].

By virtue of the restriction which we have placed on F (that it admit neither
the Riemann sphere nor a closed Riemann surface of genus one as a continuation),
F is not simply-connected. F̃, the universal covering surface of F, may be
defined by its covering relation to F as a Riemann surface. When so defined, F̃
is conformally equivalent to the interior of the unit circle in the complex plane.
Thus, being given any point w₀ of F, we may assert the existence¹ of a
single-valued map


(2.1) w = w(z, w₀)


of |z| < 1 onto F with the following properties:

1° w(0, w₀) = w₀;

2° to each ζ with |ζ| < 1 and U(w) where w = w(ζ, w₀) there corresponds a
neighborhood of ζ, N(ζ), such that τ_{U(w)}[w(z, w₀)] is analytic and univalent in
N(ζ);

3° every point w of F has an antecedent with respect to (2.1) in |z| < 1;

4° any local determination of the inverse of w(z, w₀), z(w), in U(w) for which
ζ = z(w, w₀) can be continued along any analytic arc lying in F whose initial
point is w.

Since F is not simply-connected, w(z, w₀) is not univalent and hence by 2°,
3°, and 4° it is automorphic with respect to a group 𝔊(F) of linear fractional
transformations z | Tz which map |z| < 1 onto itself. By 2° the
transformations T are either hyperbolic or parabolic, but never elliptic. To avoid
circumlocutions, we recall that a group 𝔊 consisting of linear fractional
transformations mapping |z| < 1 onto itself, which is properly discontinuous for |z| < 1,
is called Fuchsian if it has a finite number of generators, otherwise Fuchsoid [2].
Since we shall consider exclusively groups which have no elliptic
transformations, we shall employ the unqualified terms "Fuchsian group" and "Fuchsoid
group", understanding that the groups considered are free from elliptic
transformations.
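
The trichotomy hyperbolic / parabolic / elliptic used above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard normal form z → (az + b)/(b̄z + ā) with |a|² − |b|² = 1 for a linear fractional map of |z| < 1 onto itself, and classifies the map by the absolute value of the trace 2 Re a of its matrix. The function names are hypothetical.

```python
import cmath
import math


def normalise(a, b):
    """Scale (a, b) so that |a|^2 - |b|^2 = 1."""
    d = abs(a) ** 2 - abs(b) ** 2
    if d <= 0:
        raise ValueError("need |a| > |b| for a map of the disc onto itself")
    s = math.sqrt(d)
    return a / s, b / s


def apply(a, b, z):
    """Evaluate z -> (a z + b) / (conj(b) z + conj(a))."""
    a, b = complex(a), complex(b)
    return (a * z + b) / (b.conjugate() * z + a.conjugate())


def classify(a, b, tol=1e-9):
    """Classify the disc automorphism by the trace 2*Re(a) of its matrix."""
    a, b = normalise(complex(a), complex(b))
    if abs(b) < tol and abs(a.imag) < tol:
        return "identity"
    trace = abs(2 * a.real)
    if trace < 2 - tol:
        return "elliptic"      # one fixed point inside the disc
    if trace <= 2 + tol:
        return "parabolic"     # one fixed point on |z| = 1
    return "hyperbolic"        # two fixed points on |z| = 1


print(classify(cmath.exp(0.3j), 0))            # a rotation
print(classify(math.cosh(1.0), math.sinh(1.0)))
print(classify(1 + 1j, 1j))
```

A rotation about the origin is the typical elliptic case; it fixes a point of |z| < 1, which is why such maps cannot occur in the group of cover transformations described in the text.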

The group 𝔊(F) being Fuchsian or Fuchsoid, it follows that to any point
z₀ of |z| < 1, there corresponds a neighborhood N(z₀) such that every
simply-connected region g containing z₀ and contained in N(z₀) satisfies


(2.2) T₁g · T₂g = 0 (T₁, T₂ ∈ 𝔊(F), T₁ ≠ T₂).


We now consider the space F* formed by identifying the points of |z| < 1
which are equivalent with respect to 𝔊(F) [16, p. 31]. Neighborhoods U*(w₀)
of a point w₀ of F* are defined to be those sets of points of F* which correspond
to regions of the type g under the identification, where g is to contain a point
z of |z| < 1 in correspondence with w₀. It is readily seen that with this
definition F* is a surface. The maps which associate U*(w) with Tg (T ∈ 𝔊(F))
under the identification define F* as a Riemann surface. The Riemann
surfaces F and F* are in (1, 1) correspondence and are conformally equivalent. We
shall denote F* by (|z| < 1) (mod 𝔊(F)), and, in general, by E (mod 𝔊) we
shall denote the space formed by identifying points of a set E of the extended
z-plane which are equivalent with respect to a Fuchsian or Fuchsoid group 𝔊.




Suppose now that we have another map of the type (2.1) of |z| < 1 onto F.
This second map is automorphic with respect to a Fuchsian or Fuchsoid group
𝔊'(F). There exists a linear fractional transformation S mapping |z| < 1 onto
itself such that


(2.3) 𝔊'(F) = S𝔊(F)S⁻¹.


Conversely, if S is an arbitrary linear fractional transformation mapping |z| < 1
onto itself, then the Riemann surface (|z| < 1) (mod S𝔊(F)S⁻¹) is conformally
equivalent to F. Thus there is associated with F a class of Fuchsian or Fuchsoid
groups, each group being obtained from another by taking the transform with
respect to an appropriately chosen linear fractional transformation S mapping |z| < 1
onto itself, such that when we identify |z| < 1 with respect to any one of these groups
in the manner indicated above, we obtain a Riemann surface conformally equivalent
to F.

On the other hand, if we start with a class of groups, consisting of a given
non-trivial Fuchsian or Fuchsoid group 𝔊, and its transforms, S𝔊S⁻¹, where S
is an arbitrary linear fractional transformation mapping |z| < 1 onto itself,
there is associated with this class of groups by the above identification process,
a class of conformally equivalent Riemann surfaces, (|z| < 1) (mod S𝔊S⁻¹).
These remarks will be significant in the present discussion.
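
The transform S𝔊S⁻¹ of (2.3) can be illustrated numerically. The following sketch rests on an assumption not stated in the paper: maps of |z| < 1 onto itself are represented by 2×2 complex matrices of the form [[a, b], [b̄, ā]]. It checks, for one sample pair, that STS⁻¹ again has this form and has the same trace as T, so that a hyperbolic or parabolic T keeps its type under the transform. All names are hypothetical.

```python
import math


def disc_matrix(a, b):
    """Matrix [[a, b], [conj(b), conj(a)]] of z -> (a z + b)/(conj(b) z + conj(a))."""
    a, b = complex(a), complex(b)
    return ((a, b), (b.conjugate(), a.conjugate()))


def mul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inv(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return ((s / det, -q / det), (-r / det, p / det))


def transform(s, t):
    """Return S T S^{-1}, the transform of T with respect to S."""
    return mul(mul(s, t), inv(s))


def is_disc_matrix(m, tol=1e-9):
    """Check that m has the form [[a, b], [conj(b), conj(a)]]."""
    (p, q), (r, s) = m
    return abs(r - q.conjugate()) < tol and abs(s - p.conjugate()) < tol


def trace(m):
    return m[0][0] + m[1][1]


T = disc_matrix(math.cosh(1.0), math.sinh(1.0))  # a hyperbolic sample
S = disc_matrix(1 + 1j, 1j)                      # an arbitrary disc map
C = transform(S, T)
print(is_disc_matrix(C), abs(trace(C) - trace(T)) < 1e-9)
```

Since the trace is unchanged by the transform, a group free from elliptic transformations stays free from them under S𝔊S⁻¹, which is consistent with the class of groups attached to F in the text.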

It may be shown [2, Chap. III] for a Fuchsian or Fuchsoid group 𝔊, which
is not cyclic, that the limit points of the set {Tz}, where z is any point of |z| < 1
and T ∈ 𝔊, consist of either |z| = 1 or else of a perfect totally disconnected set
of points on |z| = 1. In the first case 𝔊 is said to be of the first kind, and in
the second case of the second kind.


3. The families Φ and Φ₀


Let F be a given continuable Riemann surface satisfying the requirement
imposed in §1 and let w₀ denote a given point of F. By hypothesis there exists
a Riemann surface G and a (1, 1) directly conformal map of F onto a proper
subset of G. Let W = W(w) denote this correspondence and let W₀ denote
W(w₀). Now let w(z, w₀) and W(Z, W₀) define uniformization maps of |z| < 1
and |Z| < 1 onto F and G respectively of the type (2.1) such that w(0, w₀) = w₀
and W(0, W₀) = W₀. The function Z = f(z) which is uniquely defined by the
requirements


(3.1) f(0) = 0, W[f(z), W₀] = W[w(z, w₀)]


has the following properties:
A₁) Z = f(z) is single-valued, analytic and of modulus less than unity for |z| < 1.
A₂) If T ∈ 𝔊(F), then f(Tz) = U_T[f(z)] where U_T ∈ 𝔊(G).
A₃) If f(z₁) is equivalent to f(z₂) with respect to 𝔊(G), then z₁ is equivalent to z₂
with respect to 𝔊(F).
We remark that we may fix w(z, w₀) (which depends upon a single real parameter
θ) once and for all and hence 𝔊(F). This choice of w(z, w₀) having been made,
we may choose W(Z, W₀) so that

A₄) f'(0) > 0.

Observe that f(z) defines a homomorphism or an isomorphism of 𝔊(F) into
𝔊(G). Two cases are to be distinguished: either the image of 𝔊(F) under this
homomorphism is precisely 𝔊(G), or else the image of 𝔊(F) is a proper part of
𝔊(G). This latter situation is illustrated by the example of a Fuchsian group
𝔊 of the second kind. The group 𝔊 is properly discontinuous in the extended
z-plane save at a perfect totally disconnected set of points E on |z| = 1. If F₁
is the Riemann surface (|z| < 1) (mod 𝔊) and F₂ is the Riemann surface


[extended z-plane deleted in E] (mod 𝔊)


the latter is a continuation of the former. Upon examining the structure of the
homomorphism of 𝔊(F₁) into 𝔊(F₂) defined by f(z) of (3.1), we find that the
image of 𝔊(F₁) is a proper subgroup of 𝔊(F₂).

On the other hand, let there be given two groups 𝔊: {T} and Γ: {S}, Fuchsian
or Fuchsoid, and let there exist a function Z = f(z) which is analytic and of
modulus less than unity, and which satisfies in addition to the condition f(0) = 0
the conditions A₁-A₄, where 𝔊 replaces 𝔊(F) and Γ replaces 𝔊(G). The
Riemann surface G = (|Z| < 1) (mod Γ) is a continuation of the Riemann surface
F = (|z| < 1) (mod 𝔊). This follows from A₂, A₃, and the analyticity of
f(z).

It is to be observed that only those elements $S$ of $\Gamma$ occur in the homomorphic
image of $\mathfrak{G}$ defined by $f(z)$ for which $S(0)$ is the image of some point of
$|z| < 1$ under the map $Z = f(z)$. Hence the strict homomorphic image of $\mathfrak{G}$,
$\Gamma_0$, defined by $f(z)$ also yields a Riemann surface $G_0$ which is a continuation
of $F$, for $f(z)$, $\mathfrak{G}$, $\Gamma_0$ satisfy the conditions $A_1$-$A_4$ with $\Gamma_0$ replacing
$\Gamma$. We shall denote the class consisting of $F$ and all its continuations by
$\Phi$, and by $\Phi_0$ that subclass of $\Phi$ consisting of $F$ and all its continuations
$\{G_0\}$ for which some $f(z)$ of (3.1) defines a strict homomorphic map of $\mathfrak{G}(F)$
onto $\mathfrak{G}(G_0)$.

It is apparent from the conditions $A_1$-$A_4$ that, if $F$ admits a continuation
$G \in \Phi$, it always admits a continuation $G_0 \in \Phi_0$.


4. The family $\Phi_0$ as an $\mathcal{L}$-space [3]

Let there be given a sequence of elements $\{G_k\}$ $(k = 1, 2, \dots)$ and an element
$G_0$ of $\Phi_0$. The sequence $\{G_k\}$ will be said to have the limit $G_0$, denoted by
$\mathcal{L}_{k \to \infty} G_k$, if to each $G_k$ there corresponds a function $f_k(z)$ satisfying the
conditions $A_1$-$A_4$ with $\mathfrak{G}(G)$ replaced by $\mathfrak{G}(G_k)$, and if to $G_0$ there
corresponds a function $f_0(z)$ satisfying $A_1$-$A_4$ with $\mathfrak{G}(G)$ replaced by $\mathfrak{G}(G_0)$
such that

(4.1) $\lim_{k \to \infty} f_k(z) = f_0(z)$

for $|z| < 1$, and each $f_k$ and $f_0$ define strict homomorphic mappings of $\mathfrak{G}(F)$
onto $\mathfrak{G}(G_k)$ and $\mathfrak{G}(G_0)$ respectively. With this definition of a limit, $\Phi_0$
becomes an $\mathcal{L}$-space in the sense of Fréchet. Our first object will be to show that
$\Phi_0$ is compact. That is, given any sequence of elements $\{G_k\}$ of $\Phi_0$, there
exists a subsequence $\{G_{n(k)}\}$ and an element $G_0$ of $\Phi_0$ such that

(4.2) $\mathcal{L}_{k \to \infty} G_{n(k)} = G_0$.

As a preliminary step in the proof of the compactness of $\Phi_0$ we establish

(4.3) $\mu = \operatorname{g.l.b.} f'(0) > 0$,


where $f(z)$ is taken over the class of functions which satisfy the conditions
$A_1$-$A_4$ for some $\mathfrak{G}(G)$, $G \in \Phi_0$ (or $\Phi$). If the statement (4.3) were not true,
there would exist a sequence of functions $\{f_n(z)\}$ $(n = 1, 2, \dots)$ such that
$f_n(z)$ satisfies $A_1$-$A_4$ for $\mathfrak{G}(G_n)$ and $\lim_{n \to \infty} f_n'(0) = 0$. We denote
$f_n'(0)$ by $\alpha_n$, and consider in place of the functions $f_n(z)$, the functions

(4.4) $\varphi_n(z) = f_n(z)/\alpha_n$ $(n = 1, 2, \dots)$.


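From $A_1$-$A_4$ we infer that the function $\varphi_n(z)$ satisfies the following conditions:

$B_1$) $\varphi_n(z)$ is analytic for $|z| < 1$, $\varphi_n'(0) = 1$.

$B_2$) If $f_n(T) = U_T^{(n)}[f_n(z)]$, where $T \in \mathfrak{G}(F)$, $U_T^{(n)} \in \mathfrak{G}(G_n)$, then
$\varphi_n(T) = V_T^{(n)}[\varphi_n(z)]$, where $V_T^{(n)} = \alpha_n^{-1} U_T^{(n)} \alpha_n$, $\alpha_n$ being the
linear transformation $z \mid \alpha_n z$.

$B_3$) If $\varphi_n(z_1)$ is equivalent to $\varphi_n(z_2)$ with respect to
$\alpha_n^{-1} \mathfrak{G}(G_n) \alpha_n$, then $z_1$ is equivalent to $z_2$ with respect to $\mathfrak{G}(F)$.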
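
The passage from the conditions on $f_n$ to those on $\varphi_n$ is a one-line conjugation. The display below is only a sketch of that computation; it assumes that the second of the conditions on $f_n$ has the form $f_n(Tz) = U_T^{(n)}[f_n(z)]$ quoted in $B_2$, and it writes $\alpha_n$ both for the number $f_n'(0)$ and for the map $z \mid \alpha_n z$.

```latex
\begin{align*}
\varphi_n(Tz)
  &= \alpha_n^{-1} f_n(Tz)
   = \alpha_n^{-1}\, U_T^{(n)}\bigl[f_n(z)\bigr]
   = \alpha_n^{-1}\, U_T^{(n)}\bigl[\alpha_n \varphi_n(z)\bigr]
   = V_T^{(n)}\bigl[\varphi_n(z)\bigr], \\
\varphi_n(0) &= \frac{f_n(0)}{\alpha_n} = 0, \qquad
\varphi_n'(0) = \frac{f_n'(0)}{\alpha_n} = 1 .
\end{align*}
```

The normalization $\varphi_n'(0) = 1$ is what keeps the limit function from degenerating to a constant even though $f_n'(0) \to 0$.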

The proper discontinuity of $\mathfrak{G}(F)$ coupled with condition $B_3$ implies that to
each point $z_0$ of $|z| < 1$ there corresponds a neighborhood $N(z_0)$ such that
every member of the sequence $\{\varphi_n(z)\}$ is univalent in $N(z_0)$. In particular,
since $\varphi_n(0) = 0$ and $\varphi_n'(0) = 1$, there exists a subsequence, $\{\varphi_{n(m)}(z)\}$, of
$\{\varphi_n(z)\}$ which converges uniformly in $N(0)$ and has as its limit function a
function univalent in $N(0)$. However, since $\varphi_n(0) = 0$ $(n = 1, 2, \dots)$ and
$\varphi_n(z)$ is univalent in $N(z_0)$ for each $z_0$ in $|z| < 1$, the family $\{\varphi_n(z)\}$ is
normal for $|z| < 1$ [7, p. 34] and hence by Stieltjes' theorem [7, p. 28], the
sequence $\{\varphi_{n(m)}(z)\}$ converges continuously in the sense of Carathéodory for
$|z| < 1$. Let $\varphi_0(z)$ denote $\lim_{m \to \infty} \varphi_{n(m)}(z)$. We note $\varphi_0(0) = 0$,
$\varphi_0'(0) = 1$. The linear fractional transformation $V_T^{(n(m))}$ converges to the
linear fractional transformation $V_T$, which, a priori, is either the identity or
else a translation of the $Z$-plane preserving $Z = \infty$. The condition $B_2$ yields

(4.5) $\varphi_0(T) = V_T[\varphi_0(z)]$;


and the condition $B_3$ and the fact that $\varphi_0(0) = 0$, $\varphi_0'(0) = 1$ imply that, if
$\varphi_0(z_1)$ is equivalent to $\varphi_0(z_2)$ with respect to the group $\{V_T\}$, then $z_1$ is
equivalent to $z_2$ with respect to $\mathfrak{G}(F)$. To see this, we note that, as a
consequence of $\varphi_0(0) = 0$ and $\varphi_0'(0) = 1$, $\varphi_0(z)$ is univalent in every $N(z_0)$
where each member of the sequence $\{\varphi_n(z)\}$ is univalent. In particular, $\varphi_0(z)$
maps $N(z_2)$ (1, 1) and directly conformally onto a region in the $Z$-plane
containing $\varphi_0(z_2)$ in its interior. Since $\varphi_0(z_1)$ and $\varphi_0(z_2)$ are equivalent with
respect to $\{V_T\}$, let $V$ denote that member of $\{V_T\}$ for which

(4.6) $\varphi_0(z_2) = V[\varphi_0(z_1)]$.


Now $V = \lim_{m \to \infty} V^{(m)}$, where $V^{(m)} \in \mathfrak{G}(G_{n(m)})$. Hence for $m$ sufficiently
large, $V^{(m)}[\varphi_0(z_1)]$ lies in the image of $N(z_2)$ under $\varphi_0(z)$, $\varphi_0[N(z_2)]$.
Since $\varphi_0(z_1) = \lim_{m \to \infty} \varphi_{n(m)}(z_1)$, $V^{(m)}[\varphi_{n(m)}(z_1)]$ also lies in
$\varphi_0[N(z_2)]$ for $m$ sufficiently large. Further, for $m$ sufficiently large and for
$\rho$ $(> 0)$ sufficiently small, the circle $N_\rho(\varphi_0(z_2))$ with center $\varphi_0(z_2)$ and
radius $\rho$ is covered by the images of $N(z_2)$ with respect to $\varphi_{n(m)}(z)$. The
sequence $\{V^{(m)}[\varphi_{n(m)}(z_1)]\}$ has for its limit as $m \to \infty$, $\varphi_0(z_2)$, and hence
for $m$ sufficiently large $V^{(m)}[\varphi_{n(m)}(z_1)]$ lies in $N_\rho(\varphi_0(z_2))$ and therefore
has an antecedent $\zeta_{n(m)}$ in $N(z_2)$ with respect to the map $\varphi_{n(m)}(z)$. By $B_3$,
$\zeta_{n(m)}$ is equivalent to $z_1$ with respect to $\mathfrak{G}(F)$. Replacing $N(z_2)$ by a
sequence of neighborhoods of $z_2$, $\{M_p(z_2)\}$, $M_p(z_2) \subset N(z_2)$ $(p = 1, 2, \dots)$,
where the diameter of $M_p \to 0$ as $p \to \infty$, we find by repeating the above
argument

(4.7) $\lim_{m \to \infty} \zeta_{n(m)} = z_2$.

But $\zeta_{n(m)}$ is equivalent to $z_1$ with respect to $\mathfrak{G}(F)$. The proper discontinuity of
$\mathfrak{G}(F)$ implies $\zeta_{n(m)} = z_2$ for $m$ sufficiently large. In other words, $z_2$ is
equivalent to $z_1$ with respect to $\mathfrak{G}(F)$ as we wished to prove.

The local univalence of $\varphi_0(z)$ guarantees the proper discontinuity of the group
$\{V_T\}$ in the finite $Z$-plane. Suppose, for example, that $\{V_T\}$ contained a
sequence $\{V_k\}$ $(V_k \neq Z,\ k = 1, 2, \dots)$ which converged to $Z$ itself. Consider
any neighborhood of $z = 0$, $N(0)$; its image under $\varphi_0(z)$ would contain a set of
points $\{Z_k\}$ $(Z_k \neq 0)$ equivalent to $Z = 0$ with respect to $\{V_T\}$ which has
$Z = 0$ as a limit point. The antecedents of $Z_k$ in $N(0)$ would then have $z = 0$
as a limit point and would be equivalent to $z = 0$ with respect to the group
$\mathfrak{G}(F)$. This contradicts the proper discontinuity of $\mathfrak{G}(F)$.

The group $\{V_T\}$ must necessarily be one of the following: the identity, a
simply-periodic group of translations, a doubly-periodic group of translations.

The system of relations (4.5) and the modified statement of $B_3$ with $\varphi_0$
replacing $\varphi_n$ and $\{V_T\}$ replacing $\mathfrak{G}(G_n)$, together with the stated restrictions on
$\{V_T\}$, imply that $F$ admits the Riemann sphere or a closed Riemann surface of
genus one as a continuation. But this is contrary to our assumption on $F$
(cf. §1). Hence (4.3) is valid.

The compactness of $\Phi_0$ now follows. For let us consider a sequence of functions
$\{f_n(z)\}$ where $f_n(z)$ $(n = 1, 2, \dots)$ correlates the Riemann surfaces $F$
and $G_n$ in accordance with $A_1$-$A_4$, $f_n(z)$ replacing $f(z)$ and $\mathfrak{G}(G_n)$ replacing
$\mathfrak{G}(G)$. The sequence $\{f_n(z)\}$ is bounded by unity and is hence normal for
$|z| < 1$. To each point $z_0$ of $|z| < 1$ there corresponds a neighborhood, $N(z_0)$,
in which $f_n(z)$ is univalent $(n = 1, 2, \dots)$. There exists a subsequence
$\{f_{n(m)}(z)\}$ which converges continuously for $|z| < 1$ as $m \to \infty$. Denote
$\lim_{m \to \infty} f_{n(m)}(z)$ by $f_0(z)$. The requirements $A_1$ and $A_4$ are obviously
satisfied by $f_0(z)$. Since $f_{n(m)} \to f_0$, $U_T^{(n(m))} \to U_T$, where, a priori, $U_T$ is
either a linear fractional transformation mapping $|Z| < 1$ onto itself, or else a
constant of modulus unity. This latter possibility is to be excluded since
$f_0(0) = 0$. The set $\{U_T\}$ constitutes a group, $\mathfrak{G}_0$, since it is the homomorphic
image of $\mathfrak{G}(F)$. Finally, by virtue of (4.3) $f_0(z)$ is univalent in the $N(z_0)$
specified above for each $z_0$ of $|z| < 1$. Hence the group $\mathfrak{G}_0$ must be properly
discontinuous for $|Z| < 1$, if the proper discontinuity of $\mathfrak{G}(F)$ is not to be
violated. Obviously $\mathfrak{G}_0$ contains no elliptic transformations. The requirements
$A_2$ and $A_3$ are met by $f_0(z)$ and $\mathfrak{G}_0$, the proof that $A_3$ is satisfied being
analogous to the proof that $\varphi_0(z)$ satisfies $B_3$ with $\{V_T\}$ replacing
$\alpha_n^{-1} \mathfrak{G}(G_n) \alpha_n$. But then the Riemann surface $G_0 = (|Z| < 1)$ (mod $\mathfrak{G}_0$) is a
continuation of $F$. We have

THEOREM 4.1: The $\mathcal{L}$-space $\Phi_0$ is compact.

The theorem of Bochner [1, p. 415] is an easy corollary of Theorem 4.1. By
the compactness of $\Phi_0$, there exists a function $f_0(z)$ and a Fuchsian or Fuchsoid
group $\mathfrak{G}_0$ such that $A_1$-$A_4$ are satisfied with $f = f_0$ and $\mathfrak{G}(G) = \mathfrak{G}_0$, and
$f_0'(0) = \mu$. If the Riemann surface $(|Z| < 1)$ (mod $\mathfrak{G}_0$), which is a continuation
of $F$, were not maximal, there would exist a function $h(z)$ and a Fuchsian or
Fuchsoid group $\mathfrak{H}$ fulfilling the requirements $A_1$-$A_4$ with $f = h$, $\mathfrak{G}(G) = \mathfrak{H}$
and $h'(0) < 1$. Obviously we may take $\mathfrak{H}$ as the homomorphic image of $\mathfrak{G}_0$ under
the mapping defined by $h(z)$. Consider the function

(4.8) $g(z) = h[f_0(z)]$

which defines a homomorphic mapping of $\mathfrak{G}(F)$ onto $\mathfrak{H}$ and satisfies in
conjunction with $\mathfrak{H}$ $A_1$-$A_4$, where the appropriate replacements are made. Hence
$(|Z| < 1)$ (mod $\mathfrak{H}$) belongs to $\Phi_0$ and is a continuation of $F$ as well as of
$(|Z| < 1)$ (mod $\mathfrak{G}_0$). But $g'(0) = h'(0) f_0'(0) < \mu$. This is contrary to the
definition of $\mu$. The theorem of Bochner follows.
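
The inequality used in this argument is just the chain rule applied to (4.8). As a brief sketch, using only $f_0(0) = 0$, $f_0'(0) = \mu > 0$ and $0 < h'(0) < 1$ (the lower bound on $h'(0)$ being the normalization $f'(0) > 0$ applied to $h$):

```latex
\begin{align*}
g'(0) &= h'\bigl(f_0(0)\bigr)\, f_0'(0)
       = h'(0)\, f_0'(0)
       = h'(0)\,\mu
       < \mu .
\end{align*}
```

Since $g$ is itself an admissible function in the class over which the greatest lower bound (4.3) is taken, a value $g'(0) < \mu$ is impossible, which is the contradiction invoked in the text.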

THEOREM 4.2: Every continuable Riemann surface admits a maximal con- 
tinuation. 

We note in passing that the $F$ satisfying the condition of §1 always admit
maximal continuations belonging to $\Phi_0$.





5. Necessary and sufficient conditions that F be continuable 


Let $F$ denote a continuable Riemann surface satisfying the restrictions laid
down in §1. Up to the present we have made use of the functions $f(z)$
associated with the continuations of $F$ to establish the compactness of $\Phi_0$.
However, we have considered only the grossest properties of $f(z)$ without studying
the relation of the structure of the homomorphism defined by $f(z)$ of $\mathfrak{G}(F)$ into
$\mathfrak{G}(G)$ to the nature of the continuations which $F$ admits. We now propose to
investigate this structure in greater detail.

It is, a priori, evident that either for every continuation $G$ of $F$ and every
associated $f(z)$ the mapping of $\mathfrak{G}(F)$ into $\mathfrak{G}(G)$ is strictly isomorphic, or else
there exists a continuation $G^*$ of $F$ and an associated $f(z)$ such that some
element of $\mathfrak{G}(F)$ other than the identity is mapped into the identity of
$\mathfrak{G}(G^*)$. If the first case occurs, then every $f(z)$ defines a (1, 1) map of
$|z| < 1$ onto a subregion of $|Z| < 1$. This is a consequence of $A_3$ which states
that, for distinct $z_1$ and $z_2$,

$f(z_1) = f(z_2)$

implies $z_1$ is equivalent to $z_2$ with respect to some element $T$, not the identity,
of $\mathfrak{G}(F)$. But this would imply

$f(Tz_1) = f(z_1)$,

which is contrary to our assumption on $F$.
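
The step from $A_3$ to univalence can be spelled out as follows. This is only a sketch under stated assumptions: it takes the second of the conditions on $f$ in the form $f(Tz) = U_T[f(z)]$, and it assumes, as for the limit group in §4, that $\mathfrak{G}(G)$ contains no elliptic transformations, so that an element $U_T$ other than the identity has no fixed point in $|Z| < 1$.

```latex
\begin{align*}
f(z_1) = f(z_2),\ z_1 \neq z_2
 &\;\Longrightarrow\; z_2 = T z_1 \text{ for some } T \in \mathfrak{G}(F),\ T \neq I
   && \text{(by } A_3\text{)} \\
 &\;\Longrightarrow\; U_T\bigl[f(z_1)\bigr] = f(T z_1) = f(z_1)
   && \text{(by } A_2\text{)} \\
 &\;\Longrightarrow\; U_T = I
   && \text{(no fixed points in } |Z| < 1\text{)} .
\end{align*}
```

A transformation $T \neq I$ would then be mapped on the identity, which the first case excludes; hence $f$ is univalent there.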

If the second possibility is realized, then a normal subgroup, $g$, not reducing
to the identity, of $\mathfrak{G}(F)$ is mapped on the identity of $\mathfrak{G}(G^*)$ and $f(z)$ is
automorphic with respect to $g$. Let $R$ denote the image in $|Z| < 1$ of $|z| < 1$
under the map $Z = f(z)$. Any determination of the inverse of $f(z)$, say the
element at $Z = 0$ which carries $Z = 0$ into $z = 0$, can be continued analytically
throughout $R$ by virtue of $A_3$. The inverse of $f(z)$ takes on each value of
$|z| < 1$ precisely once. If $R$ were simply-connected, the inverse of $f(z)$ would
be single-valued in $R$, and this would contradict the fact that $f(z)$ is
automorphic with respect to a group $g$ not the identity. Hence $R$ is a
multiply-connected region lying in $|Z| < 1$ and $Z = f(z)$ is a uniformization
mapping which defines $|z| < 1$ as the universal covering surface of $R$.

It may be possible to express the boundary of $R$, to be denoted by $BR$, as the
sum of two disjoint closed sets $\sigma_1$ and $\sigma_2$, one of which, say $\sigma_1$, lies wholly in
$|Z| < 1$ and is totally disconnected. If this is so, there exists a closed Jordan
curve $c$ lying wholly in $R$ with diameter as small as we please, such that the
interior of $c$ contains points of $BR$, and no two distinct points of the closure of
the interior of $c$ are equivalent with respect to $\mathfrak{G}(G^*)$. If this is not the case,
then there exists as a consequence of the multiple connectivity of $R$ a continuum
$K$ lying wholly in $|Z| < 1$, which is a maximal connected component of $BR$.
This continuum $K$ does not contain a pair of points $Z_1$ and $Z_2$ with the property
that $Z_2 = UZ_1$, where $U (\neq I) \in \mathfrak{G}(G^*)$. Otherwise, $K$ being maximal,

(5.1) $UK = K$,

and, $U$ being either hyperbolic or parabolic, $K$ would not be at a positive
distance from $|Z| = 1$. Thus there exists a positive number $\epsilon$ for which the
$\epsilon$-neighborhood of $K$ contains no two distinct points which are equivalent with
respect to $\mathfrak{G}(G^*)$, and consequently there exists a closed Jordan curve $k$ lying
in the intersection of $R$ and the $\epsilon$-neighborhood of $K$, such that $K$ is in the
interior of $k$.

When it is noted that the region $R$ is invariant under the transformations of
$\mathfrak{G}(G^*)$, the significance of the existence of the closed Jordan curves $c$ or $k$
becomes clear. For it is readily verified that $F$ is conformally equivalent to $R$
(mod $\mathfrak{G}(G^*)$). The choice of $c$ or $k$ was such that no two distinct points in the
closure of the interior of $c$ or $k$ were equivalent with respect to $\mathfrak{G}(G^*)$. Hence
the set of points in $R$ (mod $\mathfrak{G}(G^*)$)

(5.2) $\sum_{U \in \mathfrak{G}(G^*)} U\{R \cdot [\text{interior } c \text{ or } k]\}$ (mod $\mathfrak{G}(G^*)$)

is conformally equivalent to the intersection of $R$ and the interior of $c$ or $k$,
and is bounded in part by

(5.3) $\gamma = \sum_{U \in \mathfrak{G}(G^*)} U(c \text{ or } k)$ (mod $\mathfrak{G}(G^*)$),

which is a closed Jordan curve lying in $R$ (mod $\mathfrak{G}(G^*)$). The retrosection $\gamma$
divides $R$ (mod $\mathfrak{G}(G^*)$) into two disjoint open connected sets one of which is
(5.2). This implies that $R$ (mod $\mathfrak{G}(G^*)$) and hence $F$, which is conformally
equivalent to it, have boundary elements of the first type [5, p. 164]. We have

THEOREM 5.1: A necessary condition that there exists a continuation $G^*$ of $F$
such that some $f(z)$ of (3.1) defines a homomorphism of $\mathfrak{G}(F)$ into $\mathfrak{G}(G^*)$
carrying a normal subgroup $g$ ($\supset$ but $\neq I$) into the identity of $\mathfrak{G}(G^*)$ is that
$F$ have a boundary element of the first type.

The converse of this theorem, which is also true, may be demonstrated by a
device due to de Possel [12, p. 17]. If $F$ has a boundary element of the first
type, then there exists a retrosection $\gamma$ of $F$ such that $F - \gamma$ is the sum of two
disjoint open connected subsets of $F$, $F_1$ and $F_2$, one of which, say $F_1$, is planar
(i.e. homeomorphic to a plane region) and multiply-connected. As a consequence
of a theorem due to Koebe [6], $F_1$ may be mapped (1, 1) and directly conformally
onto a plane multiply-connected region $\mathfrak{f}_1$ lying in the interior of the
unit circle $|t| = 1$ in the $t$-plane, such that under this map $\gamma$ and $|t| = 1$ are
in (1, 1) continuous correspondence. The Riemann surface $F^*$ formed from $F$
and $|t| < 1$ by identifying the corresponding points of $F_1$ and $\mathfrak{f}_1$ is a
continuation of $F$. To show that some associated $f(z)$ of (3.1) defines a
homomorphism of $\mathfrak{G}(F)$ into $\mathfrak{G}(F^*)$ carrying a normal subgroup $g$ ($\supset$ but $\neq I$)
into the identity of $\mathfrak{G}(F^*)$, observe that there exists a closed Jordan curve
$\Gamma$ lying in $\mathfrak{f}_1$, which contains in its interior a part of the boundary of
$\mathfrak{f}_1$. The image of $\Gamma$ in $F$ cannot be deformed to a point in $F$, whereas the image
of $\Gamma$ in $F^*$ can be deformed to a point in $F^*$. Hence our assertion follows.

THEOREM 5.2: If $F$ has a boundary element of the first type, then $F$ admits a
continuation $F^*$ such that some associated homomorphic mapping of $\mathfrak{G}(F)$ into
$\mathfrak{G}(F^*)$ carries a normal subgroup of $\mathfrak{G}(F)$, $g$ ($\supset$ but $\neq I$), into the identity
of $\mathfrak{G}(F^*)$.

Let us now consider necessary and sufficient conditions that a Riemann
surface $F$ admit a continuation $G$ such that some $f(z)$ of (3.1) defines an
isomorphism of $\mathfrak{G}(F)$ into $\mathfrak{G}(G)$. We shall give two such conditions, one in terms
of the structure of $\mathfrak{G}(F)$, the other in terms of intrinsic properties of $F$.

Recall that, if an $f(z)$ of (3.1) defines an isomorphic mapping of $\mathfrak{G}(F)$ into
$\mathfrak{G}(G)$, then $f(z)$ defines a (1, 1) map of $|z| < 1$ onto a subregion of $|Z| < 1$.
Under these circumstances, $\mathfrak{G}(F)$ must be of the second kind. For suppose, on
the contrary, that $\mathfrak{G}(F)$ were of the first kind; that is, that the set of points
$\{\zeta\}$ defined by

(5.4) $T\zeta = \zeta$ $(T \in \mathfrak{G}(F))$

were dense on $|z| = 1$. If $T$ is parabolic, let $\kappa$ denote an oricycle lying in
$|z| < 1$ save for the point $\zeta$ and tangent to $|z| = 1$ at $\zeta$; and if $T$ is
hyperbolic, let $\kappa$ denote a circular arc lying in $|z| < 1$ except for its endpoints
$\zeta$ and $\zeta'$ (the other fixed point of $T$). By $A_2$ when $z$ tends to $\zeta$ along $\kappa$, $f(z)$
tends to a fixed point of $U_T$, which is the unique fixed point of $U_T$, if $U_T$ or $T$
is parabolic (in this latter case it may be shown that $U_T$ is then necessarily
parabolic); otherwise, $f(z)$ tends to that fixed point of $U_T$ which is
$\lim_{|n| \to \infty} U_T^n$, the $\{n\}$ being the positive or negative integers depending upon
whether the $n$ are positive or negative in the relation

$\lim_{|n| \to \infty} T^n = \zeta$.

It follows from a lemma due to Carleman [9, p. 65] that, when $z$ tends radially
to $\zeta$, $f(z)$ tends to the fixed point of $U_T$ prescribed above.

Since we are assuming that $G$ is a non-trivial continuation of $F$ (i.e. $G$ is not
conformally equivalent to $F$), the boundary of $R$, where $R$ is the image of
$|z| < 1$ under $f(z)$, must contain points in $|Z| < 1$ and hence points in $|Z| < 1$
which are accessible from $R$. Let $\omega_Z$ denote an accessible boundary point of $R$
in $|Z| < 1$, and let $\alpha_Z$ denote a Jordan arc connecting $Z = 0$ and $\omega_Z$, lying
in $R$ save for the endpoint $\omega_Z$. The antecedent with respect to $f(z)$ of $\alpha_Z$ with
$\omega_Z$ deleted is such that when $Z$ tends to $\omega_Z$ on $\alpha_Z$, its antecedent tends to a
unique point, $\omega_z$, on $|z| = 1$. Application of Carleman's lemma shows that,
as $z$ tends to $\omega_z$ radially, $f(z)$ tends to $\omega_Z$. On the assumption that $\mathfrak{G}(F)$ is of
the first kind, there exists a sequence of arcs $\{\beta_n\}$ lying on $|z| = 1$ where
$\beta_n \supset \beta_{n+1}$ $(n = 1, 2, \dots)$, each $\beta_n$ contains $\omega_z$, the endpoints, $\zeta_1^{(n)}$ and
$\zeta_2^{(n)}$, of $\beta_n$ are fixed points of transformations of $\mathfrak{G}(F)$, and the length of
$\beta_n$ tends to zero monotonically. For each positive integer $n$, the point $\omega_Z$ is
contained in the Jordan region, $\sigma_n$, which is defined by the following two
properties:

1° It is bounded by the images with respect to $f(z)$ of the radii joining $z = 0$
to the points $\zeta_1^{(n)}$ and $\zeta_2^{(n)}$ respectively and one of the two arcs of $|Z| = 1$
with endpoints the fixed points of transformations of $\mathfrak{G}(G)$ associated with
$\zeta_1^{(n)}$ and $\zeta_2^{(n)}$ in the manner indicated above.

2° $\sigma_n$ contains the image with respect to $f(z)$ of the region bounded by the
radii joining $z = 0$ to $\zeta_1^{(n)}$ and $\zeta_2^{(n)}$ respectively and the arc $\beta_n$.

Consider the maximal connected component containing $\omega_Z$ of the intersection
of the boundary of $R$ and the set of points $C$: $|Z - \omega_Z| \le \rho$, where $\rho$ is positive
and is chosen so small that $C$ lies in $|Z| < 1$; denote this component by $\Omega$.
I say that $\Omega \subset \sigma_n$ $(n = 1, 2, \dots)$. This is a consequence of the fact that
$\omega_Z \in \sigma_n$ $(n = 1, 2, \dots)$. Since the set $\Omega$ obviously does not consist solely of the
point $\omega_Z$, and since the set of points of the boundary of $R$ which are accessible
from $R$ is dense on the boundary of $R$, the set $\Omega$ contains a point $\omega_Z'$ $(\neq \omega_Z)$
which is accessible from $R$. Repeating the argument employed before in
considering $\omega_Z$, we find that there exists a point $\omega_z'$ on $|z| = 1$ such that, when $z$
tends to $\omega_z'$ radially, $f(z)$ tends to $\omega_Z'$. But the point $\omega_z'$ must lie on the arc
$\beta_n$ for every positive integer $n$; and since the length of $\beta_n$ tends to zero as
$n \to \infty$, $\omega_z'$ must necessarily coincide with $\omega_z$. The contradiction is manifest.
Hence $\mathfrak{G}(F)$ is of the second kind.

The converse is also true. Let $F$ be a Riemann surface with $\mathfrak{G}(F)$ of the
second kind and let $E$ denote the set of points on $|z| = 1$ where $\mathfrak{G}(F)$ is not
properly discontinuous. The Riemann surface,

(5.5) $G$ = [Extended $z$-plane $- E$] (mod $\mathfrak{G}(F)$),

is a continuation of $F$. It is very simple to exhibit an $f(z)$ of (3.1) associated
with this continuation. Observe that the region $P$ consisting of the extended
$z$-plane deleted in the set $E$ is a smooth, unbounded, regular covering surface
of $G$. Let $W(z)$ denote the mapping of $P$ onto $G$ defined by the identification
(5.5), and let $z = \varphi(Z)$ denote the uniformization mapping of $|Z| < 1$ onto $P$,
where $\varphi(0) = 0$ and $\varphi'(0) > 0$. Then $W[\varphi(Z)]$ defines $|Z| < 1$ as the
universal covering surface of $G$. Observe that $W(z)$ defines $|z| < 1$ as the
universal covering surface of $(|z| < 1)$ (mod $\mathfrak{G}(F)$) which is conformally
equivalent to $F$. Now, proceeding as before, we see that $f(z)$ may be defined by

(5.6) $W(z) = W[\varphi(f(z))]$

where $f(0) = 0$. This implies, since $\varphi(0) = 0$, that the relation

(5.7) $z = \varphi[f(z)]$

holds for $|z| < 1$. The univalence of $z$ itself implies the univalence of $f(z)$.
Hence the mapping of $\mathfrak{G}(F)$ into $\mathfrak{G}(G)$ defined by $f(z)$ is isomorphic. In résumé,
we have

THEOREM 5.3: A necessary and sufficient condition that a Riemann surface F 
admit a continuation G with the property that an associated f(z) of (3.1) defines an 
isomorphic mapping of G(F) into G(G) is that G(F) be of the second kind. 
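
The univalence of $f(z)$ invoked in the proof is immediate from (5.7): a function with a left inverse is one-to-one. As a short sketch, writing $\varphi$ for the uniformization mapping of $|Z| < 1$ onto $P$ that appears in (5.7):

```latex
\begin{align*}
f(z_1) = f(z_2)
  \;\Longrightarrow\;
  z_1 = \varphi\bigl[f(z_1)\bigr] = \varphi\bigl[f(z_2)\bigr] = z_2
  \qquad (|z_1| < 1,\ |z_2| < 1).
\end{align*}
```

Note that $\varphi$ itself need not be univalent for this conclusion; only the identity (5.7) is used.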

We may also characterize Riemann surfaces $F$ for which $\mathfrak{G}(F)$ is of the second
kind intrinsically in terms of $F$. Recall that a transversal $\sigma$ of an open surface
$F$ is a (1, 1) continuous image lying in $F$, $w(t)$, of the open unit interval
$(0 < t < 1)$ and having the property that, whenever the sequence $\{t_k\}$ tends to
zero or one, the sequence $\{w(t_k)\}$ does not have any limit point in $F$. We shall
agree to call a transversal $\tau$ a $\tau$-transversal if the following conditions are
fulfilled:

1° It separates $F$ in such a manner that one of the connected components, $\mathfrak{N}$,
is simply-connected.

2° When $\mathfrak{N}$ is mapped (1, 1) and directly conformally onto the interior of the
unit circle, $|\zeta| = 1$, the length of the arc of $|\zeta| = 1$ corresponding to $\tau$ is less
than $2\pi$.

It is readily seen that the existence of a $\tau$-transversal on a Riemann surface $F$
is a conformally invariant property of $F$.

We now propose to show 

THEOREM 5.4: A necessary and sufficient condition that $\mathfrak{G}(F)$ be of the second
kind is that $F$ admit a $\tau$-transversal.

The necessity follows at once. Let $e^{i\theta}$ be a point of $|z| = 1$ where $\mathfrak{G}(F)$ is
properly discontinuous and let $\kappa$ denote the intersection of $|z| < 1$ with the
circumference of a circle with center $e^{i\theta}$ and radius chosen so small that
$T_m\kappa \cdot T_n\kappa = 0$ $(T_m, T_n \in \mathfrak{G}(F);\ T_m \neq T_n)$. Then

(5.8) $\sum_{T \in \mathfrak{G}(F)} T\kappa$ (mod $\mathfrak{G}(F)$)

is a $\tau$-transversal of $(|z| < 1)$ (mod $\mathfrak{G}(F)$) and this latter Riemann surface is
conformally equivalent to $F$.

The sufficiency of Theorem 5.4 may be demonstrated as follows. If $F$ admits
a $\tau$-transversal, the antecedent of $\mathfrak{N}$ in $|z| < 1$ with respect to the
uniformization mapping of $|z| < 1$ onto $F$ consists of the sum of disjoint
simply-connected regions. We select one of these, say $\mathfrak{n}_0$. Since $\mathfrak{n}_0$ is in (1, 1)
directly conformal correspondence with $\mathfrak{N}$ under the uniformization mapping of
$|z| < 1$ onto $F$ by virtue of the monodromy theorem, and since $\mathfrak{N}$ is in (1, 1)
directly conformal correspondence with $|\zeta| < 1$ in the manner indicated above,
by composing these two mappings, we find that there exists a univalent analytic
function, $z(\zeta)$, which maps $|\zeta| < 1$ onto $\mathfrak{n}_0$ in such a manner that, whenever
$\zeta$ tends to any interior point of the arc of $|\zeta| = 1$ complementary to the arc
corresponding to $\tau$, $|z(\zeta)|$ tends to unity. Application of Schwarz's reflection
principle [4, p. 45] yields the conclusion that $z(\zeta)$ can be continued analytically
across the interior of the arc of $|\zeta| = 1$ complementary to the arc corresponding
to $\tau$. So continued, $z(\zeta)$ maps a sufficiently small neighborhood $N$ of an interior
point of the complementary arc (1, 1) and directly conformally onto a region in
the $z$-plane containing a point of $|z| = 1$ in its interior. We take the
neighborhood $N$ to be the interior of a circle with its center the point of the
complementary arc in consideration. Hence the intersection of $N$ and $|\zeta| < 1$ is
mapped by $z(\zeta)$ onto a region on the $z$-plane lying in $\mathfrak{n}_0$ and bounded in part by
an arc of $|z| = 1$. But $\mathfrak{n}_0$ was so defined that no two distinct points of $\mathfrak{n}_0$ are
equivalent with respect to $\mathfrak{G}(F)$. Therefore there are points on $|z| = 1$ where
$\mathfrak{G}(F)$ is properly discontinuous. The group $\mathfrak{G}(F)$ must be of the second kind.’


‘de Possel has introduced in his thesis [12, p. 15] a concept closely related to our
$\tau$-transversals. It is that of a simply-connected region $D$ on $F$ which is bounded in
part by a finite number or denumerable infinity of transversals of $F$; $D$ has the property
that, when it is mapped (1, 1) and directly conformally onto the interior of the unit
circle, the arcs of the circumference of the unit circle corresponding to the transversals
form a set which is not of maximal type (l.c.). We have preferred, however, the concept of
$\tau$-transversal to that of the regions $D$ because the former concept is apparently the
more primitive and simpler.

































The following theorem follows readily from the preceding four.

THEOREM 5.5: If $\mathfrak{G}(F)$ is of the second kind, then $F$ is always continuable. If
$\mathfrak{G}(F)$ is of the first kind, then $F$ is continuable if and only if $F$ admits boundary
elements of the first type.

We remark that, if $\mathfrak{G}(F)$ is of the first kind and if $F$ is continuable, then the
boundary elements of the first type of $F$ may be characterized conformally by
the fact that $F$ admits no $\tau$-transversals.
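
For the reader's convenience, the way Theorem 5.5 is assembled from Theorems 5.1-5.3 can be laid out as a chain of implications. This is a reading aid rather than part of the original argument; it uses the dichotomy stated at the beginning of this section (every associated mapping is isomorphic, or some associated mapping carries a non-trivial normal subgroup into the identity).

```latex
\begin{itemize}
  \item $\mathfrak{G}(F)$ of the second kind
        $\;\Longrightarrow\;$ $F$ continuable
        \hfill (Theorem 5.3, sufficiency)
  \item $\mathfrak{G}(F)$ of the first kind and $F$ continuable
        $\;\Longrightarrow\;$ no associated $f$ is isomorphic
        \hfill (Theorem 5.3, necessity) \\
        $\;\Longrightarrow\;$ some associated $f$ carries a normal subgroup
        $g \neq I$ into the identity
        \hfill (dichotomy) \\
        $\;\Longrightarrow\;$ $F$ has a boundary element of the first type
        \hfill (Theorem 5.1)
  \item $F$ has a boundary element of the first type
        $\;\Longrightarrow\;$ $F$ continuable
        \hfill (Theorem 5.2)
\end{itemize}
```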


6. The exhibition of maximal continuations 


If a given continuable Riemann surface $F$ has no boundary elements of the
first type, it is easy to exhibit a maximal continuation of $F$. Indeed, the
Riemann surface $G$ defined in (5.5) is a maximal continuation of $F$. To
establish this, it suffices to show that (a) $\mathfrak{G}(G)$ is of the first kind, and (b) $G$
has no boundary elements of the first type (Theorem 5.5).

The assertion (a) may be demonstrated as follows. With the region $P$
consisting of the extended $z$-plane deleted in $E$ (§5) we associated its
uniformization mapping, $\varphi(Z)$, which is automorphic with respect to a Fuchsoid
group, $\mathfrak{G}(P)$, which is of the first kind since $E$ is totally disconnected. Now
$\mathfrak{G}(G)$ may be taken to be precisely the properly discontinuous group with respect
to which $W[\varphi(Z)]$ is automorphic (§5). Hence, $\mathfrak{G}(P)$ being of the first kind,
$\mathfrak{G}(G)$ is also and the assertion (a) follows.

Can $G$ admit boundary elements of the first type? If such boundary elements
existed, then $G$ would admit a retrosection $\gamma$ separating $G$ such that $G - \gamma$ is
the sum of two disjoint, open, connected sets, one of which, $G_1$, is planar and
multiply-connected.

The surface $G$ itself may be expressed as the sum of the following disjoint sets:
1° $F_1 = (|z| < 1)$ (mod $\mathfrak{G}(F)$); 2° $F_2 = (1 < |z| \le \infty)$ (mod $\mathfrak{G}(F)$); 3° (the
set of points on $|z| = 1$ complementary to $E$) (mod $\mathfrak{G}(F)$). The set 3° consists,
a priori, of a system $\{\sigma_k\}$ (finite or denumerable) of transversals and
retrosections of $G$. It cannot however contain any retrosections of $G$, since, if
this were so, $F_1$, which is conformally equivalent to $F$, would have boundary
elements of the first type.

It is clear that the region $G_1$ must contain points of both $F_1$ and $F_2$. If
$G_1 \subset F_1$, then $F_1$ would have boundary elements of the first type; similarly, if
$G_1 \subset F_2$, $F_2$, which is homeomorphic to $F_1$, would have boundary elements of the
first type. Since $\gamma$ is compact on $G$, it has points in common with only a finite
number of the $\sigma_k$. The set $G_1$ deleted in the points of the $\sigma_k$ in $G_1$ is the sum
(finite or denumerable) of planar regions, $g_l$, each of which must lie wholly in
$F_1$ or wholly in $F_2$, since a point of $F_1$ cannot be connected to a point of $F_2$
without crossing some $\sigma_k$. Furthermore, each $g_l$ must be simply-connected;
otherwise either $F_1$ or $F_2$ would have boundary elements of the first type.

Observe that at least one $g_l$ is not compact on $G$. Otherwise, each $g_l$ would
be bounded by a closed Jordan curve consisting of points of $\gamma$ and of a compact
subset of the finite number of the $\sigma_k$ having points in common with $\gamma$, this subset
being independent of the index $l$. It would follow that the intersection of $G_1$
with $\sum \sigma_k$ is compact on $G$. Now $G_1$ itself is not compact on $G$; hence there
must exist a (1, 1) continuous image of $(0 \le t < 1)$, $w(t)$ $(\subset G_1)$, such that,
whenever $t_k \to 1$, $\{w(t_k)\}$ is properly divergent. The arc $w(t)$ must lie wholly
in $F_1$ or wholly in $F_2$ for $t$ sufficiently near unity. Assume that for
$t \ge t_0 > 0$, $w(t) \subset F_1$, the treatment of the other possibility being similar.
Each $g_l$ being assumed compact, for $t \ge t_0$, $w(t)$ must have points in common with
an infinite number of distinct $g_l$ and hence must cross $\gamma$ for values of $t$
arbitrarily near one. This is manifestly impossible since $\gamma$ is compact on $G$.

Let $g_{l_0}$ denote therefore a $g_l$ which is not compact on $G$ and which lies in $F_1$.
The simply-connected region $g_{l_0}$ is bounded in part by an arc of $\gamma$ having
endpoints on distinct $\sigma_k$. The antecedent of $g_{l_0}$ in $|z| < 1$ with respect to the
identification mod $\mathfrak{G}(F)$ consists of the denumerable sum of disjoint
simply-connected regions, every one of which is bounded by a closed Jordan curve
consisting of points of the antecedent of $\gamma$ and of $|z| = 1$, this Jordan curve
containing an arc of $|z| = 1$ with a point where $\mathfrak{G}(F)$ ceases to be properly
discontinuous in its interior, since $g_{l_0}$ is not compact on $G$. This violates the
monodromy theorem, since each component of the antecedent of $g_{l_0}$ cannot
contain distinct points which are equivalent with respect to $\mathfrak{G}(F)$, and yet each
component of the antecedent of $g_{l_0}$ must contain infinitely many distinct points
equivalent with respect to $\mathfrak{G}(F)$ in the neighborhood of the points on its boundary
where $\mathfrak{G}(F)$ ceases to be properly discontinuous. We infer

THEOREM 6.1: If F is continuable and has no boundary elements of the first type, 
then G of (5.5) defines a maximal continuation of F. 


7. A special type of maximal extension 


In accordance with the theorem of Koebe already cited [6] it is possible to 
map a planar Riemann surface Fy (1, 1) and directly conformally onto a plane 
region. Furthermore, it is possible to map a plane region (1, 1) and directly con- 
formally onto a region which is dense in the extended complex plane. This is 
evident, if the region is simply-connected; on the other hand, if the region is 
multiply-connected, it follows from a theorem of de Possel [13] which states that 
any multiply-connected plane region may be mapped (1, 1) and directly con- 
formally onto a plane region bounded by a totally disconnected set (possibly 
vacuous) and a system of segments parallel to the real axis so that this image 
region is dense in the extended plane. 

Does an arbitrary Riemann surface F admit a maximal continuation G such 
that there is in G a (1, 1) directly conformal image of F which is dense in G? 
This question was proposed to the author by Professor Bochner. As we shall 
see, it is to be answered in the affirmative. Here, too, we shall assume that F 
does not admit the Riemann sphere or a closed Riemann surface of genus one 
as a maximal continuation (§1); the treatment of the former case has been 
indicated and the proof in the latter case may be readily supplied. 







We shall say that a Riemann surface $G$ which is a continuation of a given
continuable Riemann surface $F$ and which has a dense subset (1, 1) directly
conformally equivalent to $F$, is a dense continuation of $F$. Let $\Psi_0$ denote the
class of Riemann surfaces $G$ which are dense continuations of $F$.

The class Vo is not vacuous. Recall that, if F is continuable, then either F 
has boundary elements of the first type, or else admits a 7-transversal (§5). In 
the first case, the existence of a G € WY is assured by the proof of Theorem 5.2. 
In the second case, let d denote the simply-connected subregion of F which is 
bounded in part by 7. The region d may be mapped (1, 1) and directly con- 
formally onto d, , consisting of the interior of the unit circle in the (= 2 + im)- 
plane deleted in the set (0 S a < 1), in such a manner that the transversal + 
corresponds to the circumference | «| = 1 deleted at the point x = 1. Using 
the device of de Possel once again to identify points of |x| < 1 slit in 
(0 < x < 1) and points of d which correspond under the conformal transforma- 
tion between the two regions being considered, we obtain a dense continuation 
of F. Hence % is not vacuous. 

If there exists a $G \in \Psi_0$ which is maximal, the problem is settled. Otherwise,
let $\nu$ denote $\operatorname{g.l.b.}_{f \in \Phi_0} f'(0)$ where $\Phi_0$ is the class of f(z) of (3.1) associated with
the G of $\Psi_0$ for which the corresponding map of F into G defines G as a dense
continuation of F. Clearly $\nu \ge \mu$ (cf. (4.3)). A remark of importance is that
f(z) of (3.1) belongs to $\Phi_0$ if and only if the image of |z| < 1 under f(z) is
dense in| Z| <1. This implies that Yo C &. 

Let $G_1 \in \Psi_0$ and have the property that there is an associated $f_1(z) \in \Phi_0$ for
which $f_1'(0) < 3\nu/2$. Such a $G_1$ and $f_1(z)$ exist. Since $G_1$ is continuable, we
apply the argument just employed for F to $G_1$. We let $\Psi_1$ denote the class of
dense continuations of $G_1$; we carry over the uniformization of (3.1) taking
$|z_1| < 1$ as the universal covering surface of $G_1$, letting $z_1 = 0$ correspond to
the point $w_0^{(1)}$ of $G_1$ which is the image of $w_0$ of F. The class $\Phi_1$ bears the same
relation to $G_1$ that $\Phi_0$ bears to F, and $\nu_1$ replaces $\nu$. It is to be noted that
$\nu_1 \ge 2/3$; else there would exist an $f \in \Phi_0$ for which $f'(0) < \nu$. Proceeding as
above, we infer the existence of a $G_2 \in \Psi_1$ and an $f_2(z_1) \in \Phi_1$ for which $f_2'(0) < 4\nu_1/3$.
Advancing step by step, we apply the same argument to $G_2$, letting $(\Phi_k, \Psi_k,
w_0^{(k)}, z_k, \nu_k)$ for k = 2 bear the same relation to this system for k = 1 that this
system for k = 1 bears to $(\Phi_0, \Psi_0, w_0, z, \nu)$. The connotation of the symbolism
is clear. We observe that $\nu_2 \ge 3/4$ and that there exists an $f_3(z_2) \in \Phi_2$ with
$f_3'(0) < 5\nu_2/4$. We carry out this procedure inductively obtaining at the n-th
stage the set $(\Phi_n, \Psi_n, w_0^{(n)}, z_n, \nu_n)$ and $f_{n+1}(z_n) \in \Phi_n$, where $\nu_n > (n + 1)/(n + 2)$
and $f_{n+1}'(0) < (n + 3)\nu_n/(n + 2)$.

Now consider the sequence of functions, {h,(z)}, defined by the inductive 
relations 


(7.1) $h_1(z) = f_1(z), \quad h_n(z) = f_n(h_{n-1}(z)) \quad (n \ge 2).$













ON CONTINUATION OF A RIEMANN SURFACE 295 


The function, $h_n(z)$, defines $G_n$ as a dense continuation of F in accordance with
the conditions $A_1$-$A_4$ of §3. The sequence $\{h_n(z)\}$ converges continuously in
the sense of Carathéodory for |z| < 1. This is established by observing that

(7.2) $h_n'(0) > \mu \quad (n = 1, 2, \ldots)$

in accordance with (4.3) and that

(7.3) $|h_n(z)| \le |h_{n-1}(z)|$

for |z| < 1 and n = 2, 3, ... by Schwarz's lemma. As a consequence of the


proof of Theorem 4.1, h(z), the limit function of the sequence, $\{h_n(z)\}$, defines
a homomorphism or isomorphism of $\mathfrak{G}(F)$ onto a Fuchsoid group $\mathfrak{G}^*$ with the
property that $G^* = (|Z| < 1) \pmod{\mathfrak{G}^*}$ is a continuation of F. If we establish
that G* is a dense continuation of F and that G* is maximal, then the question
raised at the beginning of the present section is to be answered in the affirmative.

LEMMA 7.1: The Riemann surface G* is a dense continuation of F.

Define the sequence $\{h_n^{(k)}(z_k)\}$ by the relations

(7.1) $h_{k+1}^{(k)}(z_k) = f_{k+1}(z_k), \quad h_n^{(k)}(z_k) = f_n[h_{n-1}^{(k)}(z_k)] \quad (k = 1, 2, \ldots;\ n = k + 2, k + 3, \ldots).$

The argument applied to $\{h_n(z)\}$ shows that the sequence $\{h_n^{(k)}(z_k)\}$ converges
continuously for $|z_k| < 1$ as $n \to \infty$. The limit function, $h^{(k)}(z_k)$, defines an
isomorphism or homomorphism of $\mathfrak{G}(G_k)$ onto $\mathfrak{G}^*$ and G* is a continuation of
$G_k$ (k = 1, 2, ...). We also have the further relations

(7.4) $h^{(k)}[h_k^{(l)}(z_l)] = h^{(l)}(z_l)$ for $l < k$; $\quad h^{(k)}(h_k(z)) = h(z) \quad (k \ge 1).$


Let $G_0^*$ denote the image of F in G* defined by the map of F into G* associated
with h(z) in accordance with §3; and, in general, let $G_k^*$ denote the image
of $G_k$ in G* defined by the map of $G_k$ into G* associated with $h^{(k)}(z_k)$ (k =
1, 2, ...). Taking k = l + 1 in the first of the relations (7.4) and k = 1 in
the second, we infer

(7.5) $G_0^* \subset G_1^* \subset G_2^* \subset \cdots.$

We shall show that

(7.6) $\lim_{n \to \infty} G_n^* = G^*.$

Note that

(7.7) $\lim_{k \to \infty} h^{(k)}(z) = z.$

This follows from the fact that

$\left.\dfrac{d h^{(k)}(z)}{dz}\right|_{z=0} = \prod_{l > k} f_l'(0),$

coupled with the relation $\lim_{k \to \infty} \prod_{l > k} f_l'(0) = 1$ ($\prod_{l \ge 1} f_l'(0) \ge \mu$). If (7.6)
were not true, then there would exist a point $w_a \in G^*$, not belonging to $G_n^*$ for







any whole number n. But then the functions, $h^{(k)}(z_k)$ (k = 1, 2, ...), would
omit for $|z_k| < 1$ the set of points $\{U Z_a\}$ where $Z_a$ is an antecedent of $w_a$ in
|Z| < 1 and $U \in \mathfrak{G}^*$, and hence the relation (7.7) would be violated.

It remains to be shown that $G^* - G_0^*$ is nowhere dense in G*. Note that
$G^* - G_0^*$ is closed, being the complement of $G_0^*$ in G*, and that by (7.6) it is
equal to

(7.8) $(G_1^* - G_0^*) + (G_2^* - G_1^*) + (G_3^* - G_2^*) + \cdots.$

In accordance with the relations (7.4) $G_1^* - G_0^*$ is nowhere dense in $G_1^*$, and
$G_{k+1}^* - G_k^*$ is nowhere dense in $G_{k+1}^*$ for (k = 1, 2, ...). Hence $G_1^* - G_0^*$ and
$G_{k+1}^* - G_k^*$ (k = 1, 2, ...) are nowhere dense in G*. It follows that $G^* - G_0^*$
is of the first category and hence nowhere dense in G* since it is closed.

The validity of Lemma 7.1 is established. If G* is maximal, our original 
question is settled. 

Suppose, therefore, that G* is not maximal. Then there would exist a dense
continuation of G*, say H. Let f*(Z) have the corresponding significance for G*
and H that $f_{n+1}(z_n)$ has for $G_n$ and $G_{n+1}$. Since the Riemann surface H is a
dense continuation of $G_n$ (n = 1, 2, ...), it follows that

(7.9) $f^*[h^{(n)}(z_n)] \in \Phi_n \quad (n = 1, 2, \ldots).$

But $\nu_n$ of $\Phi_n$ satisfies $\nu_n > (n + 1)/(n + 2)$, and hence by (7.7)

$\left.\dfrac{d f^*(Z)}{dZ}\right|_{Z=0} \ge 1;$

by Schwarz's lemma f*(Z) = Z, which is manifestly contrary to the assumption
made. We have

THEOREM 7.1: A continuable Riemann surface always admits a maximal dense
continuation.











8. A remark 








We have left out of consideration those Riemann surfaces which admit the 
sphere or a closed Riemann surface of genus one as a continuation in order to 
preserve the unity of presentation. The treatment of these classes of Riemann 
surfaces follows from our discussion with appropriate modifications. 










THE INSTITUTE FOR ADVANCED STUDY


BIBLIOGRAPHY 


1. S. Bochner, Fortsetzung Riemannscher Flächen, Math. Annalen, vol. 98 (1927), pp. 406-421.

2. L. R. Ford, Automorphic Functions, New York, 1929.

3. M. Fréchet, Sur quelques points du calcul fonctionnel. Thesis, Paris, 1906.

4. G. Julia, Principes géométriques d'analyse I, Paris 1930.

5. B. v. Kerékjártó, Vorlesungen über Topologie I, Berlin 1928.

6. P. Koebe, Über die Uniformisierung beliebiger analytischer Kurven III, Göttinger Nachrichten, 1908, pp. 337-358.










7. P. Montel, Leçons sur les familles normales des fonctions analytiques, Paris 1927.

8. R. Nevanlinna, Ein Satz über offene Riemannsche Flächen, Annales Acad. Sci. Fenn., Ser. A, T. 54, No. 3 (1940).

9. R. Nevanlinna, Eindeutige analytische Funktionen, Berlin 1936.

10. R. de Possel, Sur le prolongement des surfaces de Riemann, C. R. Acad. Sci. de Paris, vol. 186 (1927), pp. 1092-1095.

11. R. de Possel, Sur le prolongement des surfaces de Riemann, C. R. Acad. Sci. de Paris, vol. 187 (1928), pp. 98-100.

12. R. de Possel, Quelques questions de représentation conforme. Thesis, Paris 1932.

13. R. de Possel, Zum Parallelschlitztheorem unendlich-vielfach zusammenhängender Gebiete, Göttinger Nachrichten, 1931, pp. 199-202.

14. T. Radó, Ueber eine nicht-fortsetzbare Riemannsche Mannigfaltigkeit, Math. Zeitschrift, vol. 20 (1924), pp. 1-6.

15. T. Radó, Ueber den Begriff der Riemannschen Fläche, Acta Szeged, vol. 2 (1923), pp. 101-121.

16. H. Seifert and W. Threlfall, Lehrbuch der Topologie, Leipzig 1934.

17. H. Weyl, Die Idee der Riemannschen Fläche, Leipzig 1923.










ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 








LATTICE-ORDERED GROUPS 


By GARRETT BIRKHOFF 


(Received December 10, 1941) 











1. Introduction 


We shall be concerned below with lattice-ordered groups, or l-groups,¹ in the
sense of the following definition.
DEFINITION. An l-group is
(I) a group, on which is defined a binary inclusion relation which is “homogeneous”
in the sense that


(II) $x \ge y$ implies $a + x + b \ge a + y + b$ for all a, b,












and relative to which 
(III) the group is a lattice. 

This defines l-groups as abstract algebras; as such, we can (and shall) apply 
to them such general algebraic concepts as l-subgroup (subalgebra), isomorphism, 
homomorphism, and so on. 

Three important topics will be included as special cases under the single
heading of l-groups: the additive and multiplicative groups of ordered fields,
which have long been studied by Hahn, Artin, and others, and are now exten- 
sively used in valuation theory; the study of abstract number and ideal theory 
initiated by Dedekind, and recently amplified by Krull, Ward, Lorenzen, 
Clifford, Dilworth and others; and the semi-ordered function spaces studied very 
recently by Riesz, Freudenthal, Kantorovitch, the author, Bohnenblust, Stone, 
Kakutani, and others. 

It should be stressed, however, that up to the present time only l-groups which 
are commutative or simply ordered have been studied; and it came as a con- 
siderable surprise to the author that the non-commutative case involved so few 
new difficulties. 

The material below breaks up rather naturally into several parts. First
(§§2-8) Postulates (I)-(III) are discussed, and various other equivalent systems
of postulates (together with numerous examples) are derived. Then, after
brief preliminaries on algebraic formalism, the general structure and decomposition
theory of l-groups is treated (§§9-13). After this, a complete classification
of commutative l-groups whose structure lattice has finite length is given (§§14-20).
After this, in §§21-26, special properties of complete l-groups are discussed.
Fifth, two important generalizations are taken up (§§27-28). The paper then 
concludes with a list of sixteen unsolved problems (§§30-31), some of which 
are fundamental. 

































1 We are adopting the convenient terminology of M. H. Stone, ‘‘A general theory of 
spectra. II.,’’ Proc. Nat. Acad. Sci. 27 (1941), pp. 83-87. 




2. Explanation of (I) 


The reader will be assumed to be familiar with the definitions of a group and 
of a commutative group, and with the algebraic manipulation of the elements 
of such groups under the additive notation. Thus 0 will denote the group 
identity, —a the group inverse of a, a + b the result of combining a with b, 
and na (n any integer) will denote the n-th “power” of a in the cyclic subgroup
generated by a.

In the commutative case, the rules of manipulation may be summarized in
the statement that the group behaves like a vector space over the domain of
integers. As it may be shown that every element of an l-group is of infinite
order (that is, na = 0 implies n = 0 or a = 0), even the cancellation laws hold.
In addition,² an equation of the form $nx = ma$ ($n \ne 0$) has at most one solution.
If such a solution exists, it may be denoted (m/n)a, and regarded as a rational
scalar multiple of a. In particular (op. cit., §1) the correspondence $(m/n)a \to
(m/n)$ is, for any fixed a, an isomorphism of the set of rational multiples of a
and a subgroup (which always contains all integers) of the additive group of 
all rational numbers. Thus the set of all rational scalar multiples of a may be 
thought of as a generalized cyclic subgroup. 


3. Explanation of (II) 


The concept of homogeneity applies to any binary relation on a group. 

DEFINITION. A binary relation $\ge$ on a group G is called left-homogeneous if
and only if $x \ge y$ implies, for all $a \in G$, that $a + x \ge a + y$, and right-homogeneous
if and only if it implies $x + a \ge y + a$. A relation which is both left- and
right-homogeneous is called homogeneous.

THEOREM 1. Homogeneity is equivalent to the assertion that
(II′) every group translation $x \to a + x + b$ is a lattice-automorphism. (I)³

PROOF. The condition (II′) is clearly sufficient. To prove its necessity,
recall first that all group translations are one-one. Second, not only does
$x \ge y$ imply $a + x + b \ge a + y + b$, but conversely $a + x + b \ge a + y + b$
implies


$(-a) + a + x + b + (-b) \ge (-a) + a + y + b + (-b)$


or $x \ge y$. That is, (II) implies that any group translation is an automorphism
with respect to the relation $\ge$, as asserted.

THEOREM 2. Homogeneity is equivalent to the assertion that every transformation
of the form⁴ $x \to a - x + b$ is a dual automorphism. (I)





² For the special properties of such groups, cf. Reinhold Baer, “Abelian groups without
elements of finite order,” Duke Jour. 3 (1937), pp. 68-122.

³ The postulate numbers in parentheses after the statement of a theorem refer to the
postulates which are needed to prove the theorem in question. Many of the theorems below
have a generality which far transcends the theory of l-groups.

⁴ Especially interesting are the “inversions” $x \to a - x + a$, which are of period two and
have a for fixpoint.










(This means that the correspondence replaces the given homogeneous relation
by its converse.)

PROOF. Assuming (II), $x \ge y$ is equivalent to


$a + (-x) + x + (-y) + b \ge a + (-x) + y + (-y) + b$


by Theorem 1. But this is $a - y + b \ge a - x + b$, and so the condition is
necessary. It is sufficient since it implies that $x \to 0 - ((-b) - x + (-a)) + 0 = a + x + b$
is the product of two dual automorphisms, hence an automorphism.

DEFINITION. An element a of an l-group G is called positive if $a \ge 0$. The
set of all positive elements of G will be denoted $G^+$.

THEOREM 3. Homogeneity is equivalent to the assertion that, for some set of
“positive” elements invariant under all inner automorphisms $x \to -a + x + a$,
$x \ge y$ if and only if $x - y$ is positive. (I)

PROOF. Assuming (II), clearly $x \ge y$ if and only if $x - y \ge y - y = 0$;
moreover $t \ge 0$ implies $-a + t + a \ge -a + 0 + a = 0$ for all a. Conversely,
for any set S invariant under all inner automorphisms, the relation $(x - y) \in S$
is homogeneous since $(a + x + b) - (a + y + b) = -(-a) + (x - y) + (-a)$
is, for all $a, b \in G$, the transform of $x - y$ under an inner automorphism.

COROLLARY. If G is commutative, homogeneity is equivalent to the assertion
that, for some set of positive elements, $x \ge y$ if and only if $x - y$ is positive.
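
The corollary can be illustrated on a small commutative example. The following Python sketch is not part of the original argument: it assumes the group of integer pairs under componentwise addition and a hypothetical lexicographic set of positive elements, defines x ≥ y to mean that x − y is positive, and checks homogeneity on a finite sample only (which is evidence, not a proof).

```python
from itertools import product

# Assumed example group: pairs of integers under componentwise addition.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def neg(u):
    return (-u[0], -u[1])

# Hypothetical choice of "positive" elements (a lexicographic cone).
def is_positive(v):
    return v[0] > 0 or (v[0] == 0 and v[1] >= 0)

# The relation induced by the cone: x >= y iff x - y is positive.
def geq(x, y):
    return is_positive(add(x, neg(y)))

def is_homogeneous(samples):
    """Check that x >= y implies a + x + b >= a + y + b on a finite sample."""
    return all(
        geq(add(add(a, x), b), add(add(a, y), b))
        for x, y in product(samples, repeat=2) if geq(x, y)
        for a, b in product(samples, repeat=2)
    )

SAMPLES = list(product(range(-2, 3), repeat=2))
print(is_homogeneous(SAMPLES))
```

Any other subset closed under the inner automorphisms could be substituted for `is_positive`; in a commutative group every subset qualifies, which is the content of the corollary.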


4. Explanation of (III)


Postulate III asserts that the inclusion relation $x \ge y$ satisfies the usual
conditions,
P1. For all x, $x \ge x$,
P2. If $x \ge y$ and $y \ge x$, then $x = y$,
P3. If $x \ge y$ and $y \ge z$, then $x \ge z$,

L′. Any two elements x and y have a l.u.b. $x \vee y$,

L″. Any two elements x and y have a g.l.b. $x \wedge y$.

We recall that in any lattice,⁵ the three relations $x \ge y$, $x \wedge y = y$, and
$x \vee y = x$ are mutually equivalent; indeed, this is even true in any “partially
ordered system” satisfying P1-P3. It follows that an automorphism with respect
to one of the relation or operations $\ge$, $\vee$, $\wedge$ is necessarily an automorphism
with respect to all three. Hence we get as a corollary of Theorem 1,

THEOREM 4. Left-homogeneity is equivalent to either of the dual left-distributive
laws


(1) $a + (x \vee y) = (a + x) \vee (a + y)$,
(1′) $a + (x \wedge y) = (a + x) \wedge (a + y)$;
right-homogeneity to either right-distributive law. (I, P1-P3)







⁵ The terminology and notation are identical with that of the author's book “Lattice
theory,” New York, 1940, although scant use will be made of the theorems proved there.
⁶ Discovered by Dedekind and independently Freudenthal; see footnote 13.










(In case L′-L″ do not hold, the existence of the join (meet) on one side of
an equation is intended to be equivalent to the existence of that on the other.)

We can prove from the left- and right-distributive laws just stated, and finite
induction, also the following more general finite distributive laws


$a + \bigvee y_j = \bigvee (a + y_j)$ and $\bigvee x_i + a = \bigvee (x_i + a)$,
$\bigvee x_i + \bigvee y_j = \bigvee (x_i + y_j)$,
$\sum_i \bigl(\bigvee_j x_{ij}\bigr) = \bigvee_{j(i)} \bigl(\sum_i x_{i\,j(i)}\bigr)$,
and their lattice duals.

THEOREM 5. Homogeneity is equivalent to the “monotonicity law”:
(2) $x \ge x'$ and $y \ge y'$ imply $x + y \ge x' + y'$. (I, P1, P3)

PROOF. Applying homogeneity twice, we get
$x + y \ge x' + y \ge x' + y'$,


whence (2) follows by P3. Conversely, assuming P1, we get as special cases of
(2) $x + y \ge x' + y$ and $x + y \ge x + y'$, implying homogeneity.

Again, a permutation of the elements of a partially ordered system is a dual
automorphism if and only if it interchanges the operations $\vee$ and $\wedge$. Hence,
from Theorem 2, we get

THEOREM 6. Homogeneity is equivalent to the laws


(3) $a - (x \wedge y) + b = (a - x + b) \vee (a - y + b)$,
(3′) $a - (x \vee y) + b = (a - x + b) \wedge (a - y + b)$. (I, P1-P3)


We note as a special case
(4) $x \wedge y = -(-x \vee -y)$ and dually.


From this we see that the lattice postulate L″ is redundant, in the sense that it
is implied by I, II, P1-P3, and L′.


5. Stone’s postulates 


But now P1-P3 and L′ are equivalent by pure lattice theory to the assertion
that our system admits an idempotent, commutative and associative operation
$x \vee y$, in which $x \ge y$ means $x \vee y = x$. Hence splitting (1) in two parts
(right- and left-translations), we get as a corollary of Theorem 4 and the
redundance of L″,

THEOREM 7 (Stone⁷). An l-group may be defined as a group, with a second





⁷ Stone assumed the group to be commutative, in which case one of the distributive laws
(1″) can be omitted (cf. Stone, op. cit.).
















binary operation $\vee$ which is idempotent, commutative, and associative, and satisfies
the distributive laws


(1″) $a + (x \vee y) = (a + x) \vee (a + y)$,
$(x \vee y) + b = (x + b) \vee (y + b)$.


It is a curious fact that, in virtue of the duality principle, substitution of
$\wedge$ for $\vee$ in the above system of postulates should also define an l-group!

Not only can we delete L″ from our list of postulates, but we can even weaken
L′.

DEFINITION. By the positive part $a^+$ of an element a of an l-group, is meant
$a \vee 0$; $a^- = a \wedge 0$ is dually called the negative part of a.

Using right-homogeneity, we get


(5) $a \vee b = (b - a)^+ + a = (a - b)^+ + b$.
Combining (5) with the dualization law (3), we get
(6) $a \wedge b = -(-a + (a - b)^+) = -(a - b)^+ + a$.


There follows immediately

THEOREM 8. The lattice hypotheses L′-L″ can be replaced by the condition that,
for all a, $a \vee 0$ should exist. (I, II, P1-P3)
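
Formulas (5) and (6) are easy to test in a concrete case. The sketch below is an illustration only, under the assumption that the l-group is the set of integer pairs with componentwise addition and componentwise order; the helper names are hypothetical. It compares the join and meet computed directly with the expressions built from the positive part.

```python
from itertools import product

ZERO = (0, 0)

# Assumed l-group: integer pairs, componentwise addition and order.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def neg(u):
    return (-u[0], -u[1])

def sub(u, v):
    return add(u, neg(v))

def join(u, v):
    return (max(u[0], v[0]), max(u[1], v[1]))

def meet(u, v):
    return (min(u[0], v[0]), min(u[1], v[1]))

def pos(u):
    """Positive part: the join of u with the group identity."""
    return join(u, ZERO)

def formulas_hold(a, b):
    """Check (5) and (6) for one pair of elements."""
    five = join(a, b) == add(pos(sub(b, a)), a) == add(pos(sub(a, b)), b)
    six = (meet(a, b)
           == neg(add(neg(a), pos(sub(a, b))))
           == add(neg(pos(sub(a, b))), a))
    return five and six

SAMPLES = list(product(range(-3, 4), repeat=2))
print(all(formulas_hold(a, b) for a, b in product(SAMPLES, repeat=2)))
```

Since this example group is commutative, the check does not exercise the order of the summands in (5) and (6), which matters in the non-commutative case.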

Also, a subgroup of an l-group is an l-subgroup if and only if it contains the
positive part of each of its members. Substituting in Theorem 7, we get a
further corollary.

THEOREM 9. An l-group may be defined as a group with a unary operation
$a \to a^+$ which satisfies


(7) $0^+ = 0$, (8) $c = c^+ - (-c)^+$,
(9) the operation $(a - b)^+ + b$ is associative.


A worth-while problem would be to find a less clumsy form of (9). In this
connection, $(a^+)^+ = a^+$ might be a useful partial substitute. One might also
try setting the middle letter of the associative law equal to 0.
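
To see conditions (7)-(9) at work, one may start from a unary operation alone and recover the join from it. The following sketch is an assumption-laden illustration, not the author's construction: the group is taken to be the integer pairs under componentwise addition, and the unary operation is the hypothetical `pos` below. Condition (9) is tested only on a finite sample.

```python
from itertools import product

ZERO = (0, 0)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def neg(u):
    return (-u[0], -u[1])

def sub(u, v):
    return add(u, neg(v))

# Assumed unary operation a -> a+ (componentwise maximum with 0).
def pos(u):
    return (max(u[0], 0), max(u[1], 0))

# Binary operation of condition (9), built from the unary operation only.
def op(a, b):
    return add(pos(sub(a, b)), b)

def theorem9_conditions(samples):
    """Return the truth values of (7), (8), (9) on the given sample."""
    c7 = pos(ZERO) == ZERO
    c8 = all(c == sub(pos(c), pos(neg(c))) for c in samples)
    c9 = all(op(op(a, b), c) == op(a, op(b, c))
             for a, b, c in product(samples, repeat=3))
    return c7, c8, c9

SAMPLES = list(product(range(-2, 3), repeat=2))
print(theorem9_conditions(SAMPLES))
```

In this example `op(a, b)` coincides with the componentwise maximum, so it is also idempotent and commutative.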


6. Examples 


In the next two sections, we shall be using Theorem 3 as our main tool. 

First, we note that in order to describe an l-group G up to isomorphism, it is
sufficient by Theorem 3 to describe the set $G^+$ of “positive” elements; indeed,
this principle is independent of Postulate III. We shall now describe some important
examples of l-groups in this way.

EXAMPLE 1. G is the additive group of real numbers; $G^+$ consists of all those
which are non-negative.

EXAMPLE 2. G is the additive group of the integers; $G^+$ is defined as in
Example 1.

EXAMPLE 3. G is the group of all positive rational numbers under multiplication
(the integer one is the group identity); $G^+$ is the set of all positive integers.










EXAMPLE 4. G is the group of all vectors $x = (x', x'')$ with two real components;
$G^+$ contains x if and only if $x' > 0$, or $x' = 0$ and $x'' \ge 0$.

EXAMPLE 5. G is the additive group of all real functions defined on the
interval $0 \le x \le 1$; $G^+$ consists of all those which are non-negative (satisfy
$f(x) \ge 0$ for all x).

EXAMPLE 6. G is the additive group of functions of bounded variation on
$0 \le x \le 1$ with $f(0) = 0$; $G^+$ defined as in Example 5.

EXAMPLE 7. G as in Example 6; $G^+$ consists of all “increasing” functions
(functions for which $x \ge y$ implies $f(x) \ge f(y)$).

We shall now list some examples of non-commutative l-groups. The simplest
example consists of the two-parameter non-Abelian Lie group, lexicographically
ordered as follows.

EXAMPLE 8. G consists of all couples (x, y) of real numbers, where addition
is defined by the formula


$(x, y) + (x', y') = (x + x', e^{x'} y + y')$;


$G^+$ consists of all those couples with $x > 0$ or $x = 0$, $y \ge 0$.
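
The group laws of Example 8 can be checked mechanically. The sketch below rests on two assumptions that should be kept in mind: the addition rule is read as (x, y) + (x′, y′) = (x + x′, e^{x′}y + y′), and the base e is replaced by 2, with integer first coordinates, so that the arithmetic is exact. The resulting group is only an analogue of the one in the text, but it has the same shape and the same lexicographic set of positive elements.

```python
from fractions import Fraction
from itertools import product

BASE = Fraction(2)  # assumption: 2 stands in for e to keep arithmetic exact

def add(p, q):
    """(x, y) + (x', y') = (x + x', BASE**x' * y + y')."""
    (x, y), (x2, y2) = p, q
    return (x + x2, BASE ** x2 * y + y2)

def neg(p):
    """Group inverse of (x, y) under the addition above."""
    x, y = p
    return (-x, -(BASE ** (-x)) * y)

def is_positive(p):
    x, y = p
    return x > 0 or (x == 0 and y >= 0)

def associative(samples):
    return all(add(add(a, b), c) == add(a, add(b, c))
               for a, b, c in product(samples, repeat=3))

def cone_invariant(samples):
    """Positive elements stay positive under t -> -a + t + a."""
    return all(is_positive(add(add(neg(a), t), a))
               for t in samples if is_positive(t)
               for a in samples)

SAMPLES = [(x, Fraction(y)) for x in (-1, 0, 1)
           for y in (Fraction(-1, 2), 0, 3)]

print(associative(SAMPLES), cone_invariant(SAMPLES))
print(add((1, Fraction(0)), (0, Fraction(1))),
      add((0, Fraction(1)), (1, Fraction(0))))
```

The two sums printed on the last line differ, which exhibits the non-commutativity; invariance of the positive elements under inner automorphisms is what Theorem 3 requires.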

EXAMPLE 9. G has three generators of infinite order, and defining relations
$a + b = b + a$, $a + c = c + b$, $b + c = c + a$; $G^+$ contains $ma + m'b + nc$
if and only if $n > 0$, or $n = 0$ while $m \ge 0$ and $m' \ge 0$.

EXAMPLE 10. G consists of the $x > 0$ of any ordered field or skew-field⁷ᵃ
(division ring) under multiplication; $G^+$ consists of all $x \ge 1$.

For purposes of comparison, we shall also list various other examples which
satisfy Postulates I-II and part, but not all, of Postulate III.

EXAMPLE 11. G is any group; $G^+$ consists of the identity 0 alone.

EXAMPLE 12. G is the multiplicative group of all non-zero elements of any
algebraic number field; $G^+$ is the subset of all (algebraic) integers in G.

EXAMPLE 13. G is the group of all elements of any integral domain of characteristic
infinity under addition; $G^+$ is the subset of all sums of squares.


7. Postulates of order reinterpreted 


It is trivial that the systems described in Examples 1-13 satisfy Postulates
I-II if $a \ge b$ is defined to mean $(a - b) \in G^+$ (cf. Theorem 3). We shall now
give simple tests for the validity of parts P1-P3 of Postulate III.

LEMMA 1. The reflexive law P1 is equivalent to (9) The group identity is positive. (I, II)

For $(a - a) \in G^+$ is equivalent to $0 \in G^+$ by group theory. This condition is
evidently satisfied in Examples 1-12 above.

LEMMA 2. The transitive law P3, the monotonicity law (2) of Theorem 5, and
the condition that
(10) Any sum of positive elements is positive, are mutually equivalent. (I, II, P1)

PROOF. By Theorem 5, P3 implies (2) modulo I, II, P1. Again (2) implies






⁷ᵃ Cf. K. Reidemeister, “Grundlagen der Geometrie”, p. 40.
















the closure of $G^+$ as the special case $a \ge 0$ and $b \ge 0$ imply $a + b \ge 0 + 0 = 0$.
Finally, since $(a - b) + (b - c) = (a - c)$, the closure of $G^+$ implies P3 as in
Theorem 3.

COROLLARY. The positive elements of any l-group form a semigroup.⁸

It is also a corollary that P3 holds in Examples 1-13 above.

LEMMA 3. The antisymmetric law P2 is equivalent to asserting that (11) a and
$-a$ are both positive only if $a = 0$. (I, II)

PROOF. If $(x - y) \in G^+$ and $(y - x) \in G^+$ imply $(x - y) = 0$, then P2 holds.
Conversely, if P2 holds, $z \ge 0$ and $-z \ge 0$ imply $z = 0$.

It may now be checked easily that P2 holds in Examples 1-11 above, although
not in Examples 12-13.

A similar lemma, irrelevant here, is that the symmetric law ($a \ge b$ implies
$b \ge a$) is equivalent to asserting that $a \in G^+$ implies $-a \in G^+$. From this and
Lemmas 1-2 it follows that $G^+$ defines an equivalence relation if and only if
it is a subgroup of G. (I, II)

Finally, we can read off from Lemmas 1-2 and Theorem 8, the following not
very satisfactory result.

LEMMA 4. The lattice hypotheses L′-L″ are equivalent to the following condition:
(12) Given a, there exists $a^+$ such that u and $(u - a)$ are both positive if and only
if $(u - a^+)$ is. (I, II, P1, P3)

From Theorem 3 and Lemmas 1-4, we conclude as the final theoretical result
of this section,

THEOREM 10. An l-group may be defined as a group G with a subset $G^+$ of
“positive” elements which satisfies conditions (9)-(12).

We also conclude that Examples 1-10 above are l-groups. In Examples
1, 2, 4, 8, 10 this is true because the ordering is simple: for all a, either a or
$-a$ is in $G^+$; and so $a^+$ is a or 0 accordingly. In Example 3, $(m/n)^+$ is the numerator
of m/n when written in lowest terms; in Examples 5-6, $f^+$ is the “positive
part” of f as usually defined (equal, for all x, to the larger of f(x) or 0); in Example
7, $f^+$ is the “positive variation” of f. In Example 9, $(ma + m'b + nc)^+$ is
0 if $n < 0$, $(ma + m'b + nc)$ if $n > 0$, and $m^+a + m'^+b$ if $n = 0$.
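
The statement about Example 3 can be confirmed by brute force. In the sketch below, which is only an illustration, x ≥ y is taken to mean that x/y is a positive integer, as the choice of positive elements in Example 3 prescribes; the function names and the search bound are hypothetical. The positive part is found as the least common upper bound of x and the identity 1 and can then be compared with the numerator.

```python
from fractions import Fraction

def geq(x, y):
    """x >= y in Example 3: the quotient x / y is a positive integer."""
    q = Fraction(x) / Fraction(y)
    return q > 0 and q.denominator == 1

def positive_part(x, search_limit=200):
    """Least upper bound of x and the identity 1, by exhaustive search.

    Every upper bound of 1 is a positive integer, so only integers up to
    search_limit (an arbitrary bound for this sketch) are examined.
    """
    uppers = [Fraction(c) for c in range(1, search_limit + 1)
              if geq(c, x) and geq(c, 1)]
    least = [u for u in uppers if all(geq(v, u) for v in uppers)]
    return least[0]

for value in (Fraction(6, 4), Fraction(2, 3), Fraction(1, 7)):
    print(value, positive_part(value), value.numerator)
```

`Fraction` reduces to lowest terms automatically, so `value.numerator` is the numerator referred to in the text.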


8. Fifth set of postulates 


We have characterized l-groups by four sets of postulates. Our definition
was in terms of the group operation and a binary relation; Theorem 7 in terms
of the group operation and a binary operation; Theorem 9 in terms of the group
operation and a unary operation; Theorem 10 in terms of the group operation
and a unary relation or set. We shall now give a fifth set of postulates for
l-groups which, oddly enough, is in terms of the group operation alone!

Evidently any l-group or other lattice has the following “Moore-Smith”
property:
(13) Given a and b, there exists a c such that $c \ge a$ and $c \ge b$.





⁸ By a semigroup, we mean a system closed under an associative binary operation and
having an identity element, in which the laws of cancellation hold ($a + x = a + y$ implies
$x = y$, and so does $x + a = y + a$).










Lemma 1 (Clifford⁹). The Moore-Smith property is equivalent to the assertion
that
(14) Every element is a difference of positive elements. (I, II, $P_1$, $P_3$)

Proof. Assuming (13) with $b = 0$, we get $a = c - (-a + c)$, where $c \geq 0$
and $-a + c = -a + (c - a) + a \geq -a + 0 + a = 0$. Conversely, if $a = a' - a''$
and $b = b' - b''$, where $a', a'', b', b''$ are positive, then $c = a' + b'$
exceeds both $a$ and $b$.

Now let $A$ be any group with a relation $\geq$ satisfying II, $P_1$, $P_3$ and (14).
We shall show that $A$ is determined to within isomorphism by the semigroup
$A^+$ of its positive elements. The proof is related to the general theory of the
extension of semigroups to groups.

THEOREM 11. In the notation of the calculus of complexes, $a + A^+ = A^+ + a$,
for all $a \in A$.

Proof. Both sets consist of the elements containing $a$.

Corollary. Given $a$ and $x$ in $A^+$, there exist a unique $y \in A^+$ such that $a + x =
y + a$ and a unique $z$ in $A^+$ such that $x + a = a + z$.

The existence follows from Lemma 1; the uniqueness from the cancellation
postulate defining semigroups.

Now observe that $A$ consists by (14) of the differences $b - c$ of elements of
$A^+$, equated and combined by the rules
(15) $b - c = b' - c'$ if and only if $t$, $u$ exist in $A^+$ such that $b + t = b' + u$
and $c + t = c' + u$,

(16) $(b - c) + (b' - c') = (b + b') - (c' + c'')$, where $c''$ is the unique solution
of $b' + c'' = c + b'$.

The sufficiency of (15) is clear; as regards the necessity, if we choose $s \geq b, b'$,
$t = -b + s$, $u = -b' + s$, then $b + t = s = b' + u$, while if $b - c = b' - c'$,
then $b + t - t - c = b' + u - u - c'$, whence $-t - c = -u - c'$ and
$c' + u = c + t$.

Clearly equations (15)-(16) describe the group structure of $A$ in terms of that
of $A^+$. Moreover since
(17) $(b - c) \in A^+$ if and only if $b = c + t$ for some $t \in A^+$,
the lattice structure of $A$ can also be described in terms of the group structure
of $A^+$. In fact, $b - c \geq b' - c'$ if and only if $t$, $u$ exist such that $b + t \geq b' + u$
and $c + t \leq c' + u$.

Conversely, suppose $S$ is any semigroup in which, for all $a$, $a + S = S + a$
(in multiplicative language, such that the left-multiples of any element are all
right-multiples, and conversely). Then equations (15)-(16) may be shown to
define a group.¹⁰ We shall omit the details; one shows that (15) defines an


(13) Given $a$, $b$, there exists $c$ with $c \geq a$ and $c \geq b$.





9 A. H. Clifford, "Partially ordered Abelian groups," Annals of Math. 41 (1940), pp.
465-473, esp. p. 467. From the equation $a = \frac{1}{4}((a + 1)^2 - (a - 1)^2)$, we see that (14)
holds in Example 13.

10 R. Baer has proved, in conversation, that a group can be constructed whenever, given
$a$ and $b$, $x$ and $y$ can be found such that $a + x = b + y$.







equivalence relation, which is a congruence relation with respect to the addition
defined by (16), and relative to which the latter is associative, and gives any
element $b - c$ an inverse $c - b$. Furthermore, under (17), the set of positive
elements forms a subset of $A$ isomorphic with $S$.
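
In the commutative case the rules (15)-(17) can be checked by hand or by machine. The following Python sketch is an illustration only, not part of the argument: it assumes that $S$ is the semigroup of non-negative integers under addition, represents the formal difference $b - c$ by the pair `(b, c)`, and uses the commutative simplification of (15), namely $b + c' = b' + c$. All function names are hypothetical.

```python
# Illustration only: S is assumed to be {0, 1, 2, ...} under addition.
# A pair (b, c) stands for the formal difference b - c.

def equal(p, q):
    """Rule (15), commutative form: b - c = b' - c' iff b + c' = b' + c."""
    (b, c), (b2, c2) = p, q
    return b + c2 == b2 + c

def add(p, q):
    """Rule (16); in the commutative case c'' = c, so components add."""
    (b, c), (b2, c2) = p, q
    return (b + b2, c + c2)

def neg(p):
    """The inverse of b - c is c - b."""
    b, c = p
    return (c, b)

def is_positive(p):
    """Rule (17): b - c is positive iff b = c + t for some t in S."""
    b, c = p
    return b >= c

x = (2, 5)   # represents -3
y = (7, 1)   # represents 6
print(equal(add(x, y), (3, 0)))       # the sum represents 3
print(equal(add(x, neg(x)), (0, 0)))  # x plus its inverse is the identity
```

The positive pairs are exactly those equivalent to some `(t, 0)`, which is the copy of $S$ inside the group of differences.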

It follows, by Theorem 10, that we get an l-group provided (11)-(12) hold.
But now (11) is clearly equivalent to
(17') If $a + b = 0$ in $S$, then $a = b = 0$.
Finally, if any two elements of $S$ have a l.u.b. with respect to the definition
(17'') $b \geq c$ if and only if $b = c + t$ for some $t \in S$,
then for any $a = a' - a''$ of $A$ ($a', a'' \in S$) there exists $a^+ = (a' \vee a'') - a''$,
which proves that condition (12) holds. There follows

THEOREM 12 (von Neumann¹¹). An l-group may be defined as the extension
to a group of a (multiplicative) semigroup $S$, in which (i) $ab = 1$ implies $a = b = 1$,
(ii) $aS = Sa$ for all $a$, (iii) any two elements have a least common multiple. In
this group $S$ consists of the positive elements.

Corollary¹². A commutative l-group may be defined as the extension to a
group of a commutative semigroup $S$, in which (i) and (iii) hold.

In fact, (i) is not really essential, if we are willing to introduce an equivalence 
relation. 
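
As an illustration (an assumption made for this sketch, in the spirit of Example 3), take $S$ to be the positive integers under multiplication, where (i)-(iii) hold and the least common multiple is the usual one. The extension is then the group of positive rationals, and the formula $a^+ = (a' \vee a'') - a''$ reads, multiplicatively, $\mathrm{lcm}(a', a'')/a''$. The helper names below are hypothetical.

```python
from fractions import Fraction
from math import gcd

def lcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)

def positive_part(q):
    """a^+ = lcm(a', a'') / a'' for a positive rational a = a'/a''."""
    m, n = q.numerator, q.denominator
    return Fraction(lcm(m, n), n)

q = Fraction(12, 18)                  # equals 2/3
print(positive_part(q))               # the numerator in lowest terms
# A representation 12/18 that is not in lowest terms gives the same value.
print(Fraction(lcm(12, 18), 18) == positive_part(q))
```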


9. Distributive law; disjoint elements 


The following material belongs logically directly after §3, and is independent 
of the results of §§4-8 above. 
THEOREM 13. In any l-group, we have for all $a$, $b$,


(18) $a - (a \wedge b) + b = b \vee a$.


Proof. Substituting $a$ for $x$ and $b$ for $y$ in formula (3), Theorem 6, we get
(18) explicitly.
Corollary 1 (Dedekind¹³). In any commutative l-group,


(19) $a + b = (a \wedge b) + (a \vee b)$ for all $a$, $b$.


In Example 3, the modular law (19) specializes to the celebrated identity
$ab = (a, b)[a, b]$ of number theory. It also specializes, setting $b = 0$, to

Corollary 2. For any $a$, $a = a^+ + a^-$.
In words, each element $a$ is the sum of its positive part and its negative part
(so-called Jordan decomposition).
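
Formula (19) and Corollary 2 are easy to test numerically. The sketch below is illustrative only; it assumes the l-group of integer pairs with componentwise order (so that meet and join are componentwise minimum and maximum) and, for the number-theoretic identity, the positive integers with gcd and lcm. The function names are hypothetical.

```python
from math import gcd

def meet(u, v):
    return tuple(min(x, y) for x, y in zip(u, v))

def join(u, v):
    return tuple(max(x, y) for x, y in zip(u, v))

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def modular_law_holds(a, b):
    """Check (19): a + b = (a ^ b) + (a v b)."""
    return add(a, b) == add(meet(a, b), join(a, b))

def jordan(a):
    """Return (a^+, a^-) with a = a^+ + a^-."""
    zero = tuple(0 for _ in a)
    return join(a, zero), meet(a, zero)

def gcd_lcm_identity(a, b):
    """Multiplicative form of (19): ab = (a, b)[a, b]."""
    least_common_multiple = a * b // gcd(a, b)
    return a * b == gcd(a, b) * least_common_multiple

print(modular_law_holds((3, -2), (-1, 5)))
print(jordan((3, -2)))
```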

THEOREM 14. Any l-group is a distributive lattice¹⁴.





11 This result was communicated orally to the author.

12 This result seems to have been known for ideals, but not in abstracto. The author has
been unable to find a precise reference; cf. Krull's "Idealtheorie."

13 Discovered in 1897; cf. Ges. Werke, Brunswick, 1931, vol. II, p. 133, formula (13);
rediscovered by H. Freudenthal, "Teilweise geordnete Moduln," Amsterdam Proc. 39
(1936), p. 642.

14 In the commutative case, discovered by Dedekind, op. cit., p. 135, formulas (18)-(19);
rediscovered by Freudenthal, op. cit. p. 642, formulas (3.2).










Proof. Bergmann has shown ("Lattice Theory", p. 75) that a lattice is
distributive if and only if $a \wedge x = a \wedge y$ and $a \vee x = a \vee y$ imply $x = y$.
But they imply by (18),

$x = (a \wedge x) - a + (x \vee a) = (a \wedge y) - a + (y \vee a) = y$.

THEOREM 15. In any l-group¹⁵, we have
(20') $a \wedge b = 0$ and $a \wedge c = 0$ imply $a \wedge (b + c) = 0$,

(20'') $a \vee b = 0$ and $a \vee c = 0$ imply $a \vee (b + c) = 0$.


First Proof. By hypothesis and formula (1'), $c = (a \wedge b) + c = (a + c) \wedge
(b + c)$. Substituting,


$0 = a \wedge c = a \wedge (a + c) \wedge (b + c) = a \wedge (b + c)$,


since $a \leq a + c$. The second conclusion follows by duality.
Second Proof. Since $a$, $b$, $c$ are positive, clearly $a \wedge (b + c) \geq 0$. But
by the distributive law (1'),


$0 = 0 + 0 = (a \wedge b) + (a \wedge c)$
$= (a + a) \wedge (a + c) \wedge (b + a) \wedge (b + c) \geq a \wedge (b + c)$,


proving (20'). Formula (20'') follows dually.
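
A concrete reading of (20') may help. In the multiplicative setting assumed here for illustration (positive integers ordered by divisibility, in the spirit of Example 3), the meet is the greatest common divisor, the identity is 1, and (20') says that an integer prime to $b$ and to $c$ is prime to $bc$. The following sketch checks this over a small range; the function names are hypothetical.

```python
from math import gcd

def disjoint(a, b):
    """a and b are disjoint when their meet (here the gcd) is the identity 1."""
    return gcd(a, b) == 1

def check_closed_under_product(limit):
    """Verify the multiplicative form of (20') for all 1 <= a, b, c < limit."""
    for a in range(1, limit):
        for b in range(1, limit):
            for c in range(1, limit):
                if disjoint(a, b) and disjoint(a, c) and not disjoint(a, b * c):
                    return False
    return True

print(check_closed_under_product(30))
```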

We can reword Theorem 15 in terms of the important concept of disjointness. 

Definition. Two positive elements $a$ and $b$ will be called disjoint (in symbols,
$a \perp b$) if and only if $a \wedge b = 0$.

In Example 3, this specializes to the concept of relative primeness. Theorem
15 asserts that the set of positive elements disjoint to any $a$ is closed under
addition. Furthermore, if in Theorem 13 we assume $a \wedge b = 0$ and apply
the commutative law to $b \vee a$, we get the

Lemma 1. Disjoint (positive) elements are permutable,


(21) If $a \wedge b = 0$, then $a + b = b + a$.
Lemma 2. If $b \wedge c = 0$, then $(b - c)^+ = b$ and $(b - c)^- = -c$.
Proof. By our preceding formulas, $(b - c) \vee 0 = (b \vee c) - c =
b - (b \wedge c) + c - c = b$, and dually.

Lemma 3. If $na \geq 0$, then $a \geq 0$.

Proof. Expanding by the distributive law (1'), $n(a \wedge 0) = na \wedge
(n - 1)a \wedge (n - 2)a \wedge \cdots \wedge a \wedge 0$. But if $na \wedge 0 = 0$, this equals
$(n - 1)a \wedge (n - 2)a \wedge \cdots \wedge a \wedge 0 = (n - 1)(a \wedge 0)$. Now cancelling,
we get $a \wedge 0 = 0$, as desired.





15 In the commutative case, observed by Dedekind, op. cit., p. 132; Proof 1 is Dedekind's,
Proof 2 is von Neumann's. Observe that in the proof, no restriction need be put on the
group operation (e.g., associativity); only distributivity is needed. Theorem 15 can be
generalized (§§27, 28).













Combining Lemma 3 with its dual, we get

THEOREM 16. In an l-group, every element is of infinite order except the identity.

Another corollary is the fact that, in any commutative l-group, $na \geq nb$
implies $n(a - b) \geq 0$, and so $a \geq b$. The author has been unable to prove the
plausible conjecture that this remains true in any l-group.

Lemma 4. The positive and negative parts of any element are disjoint; in
symbols,

(22) For any $a$, $(a \vee 0) \wedge (-a \vee 0) = a^+ \wedge (-a^-) = 0$.

Proof. Clearly $-(a \wedge 0) = (-a \vee 0)$; hence the two left-hand terms are
equal. But now by the distributive law, $(a \vee 0) \wedge (-a \vee 0) = (a \wedge -a) \vee 0$,
so we need only show that $-(a \wedge -a) = -a \vee a \geq 0$. But clearly
$a \vee -a \geq a \wedge -a$; hence, subtracting, $(a \vee -a) - (a \wedge -a) =
(a \vee -a) + (-(a \wedge -a)) \geq 0$, or $2(a \vee -a) \geq 0$. Now use Lemma 3
with $n = 2$.


10. Free l-groups; absolute


Now let $a$ be any element of any l-group, and set $b = a \vee 0$, $c = -a \vee 0$,
so that $b$ and $c$ are positive and disjoint, and $a = b - c$ (cf. Cor. 2 of Thm. 13
and Lemma 4 above). Further, by Theorem 15 and induction, $b \perp nc$ and
$mb \perp nc$ for all positive integers $m$ and $n$. Further, by Lemma 1, $b$ and $c$ are
permutable, and so generate a commutative group, in which, for all integers
$m$ and $n$,


$(mb + nc) + (m'b + n'c) = (m + m')b + (n + n')c$.


Finally, $(mb + nc)^+$ is $mb + nc$ unless $m$ or $n$ is negative, is $0$ if $m$ and $n$ are
negative, is (by Lemma 2 above and the disjointness of positive integral
multiples of $b$ and $c$) $mb$ if $n$ is negative but $m$ is not, and is $nc$ if $m$ is negative
but $n$ is not.

It follows that the $mb + nc$ form an l-subgroup, which is closed under lattice
and group operations (Theorem 8), and is homomorphic with the l-group of all
couples $(m, n)$ of integers, in which $(m, n) \geq 0$ means that $m \geq 0$ and $n \geq 0$.
We shall (cf. §16) refer to this as the square of the l-group of the integers under
addition.

THEeorEM 17. The free l-group with one generator is isomorphic with the square 
of the l-group of integers under addition. 

In this group, a appears as the element (1, —1), a” as (1,0), and a as (0, —1). 
We can read off various corollaries from this representation. 
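
The representation by couples of integers is convenient for small computations. The sketch below is illustrative only: it models the square of the integers with componentwise order, takes $a = (1, -1)$ as in the text, and checks the relation $(na)^+ = na^+$ for a few positive $n$. The function names are hypothetical.

```python
def pos(p):
    """Positive part of a couple (m, n) under the componentwise order."""
    m, n = p
    return (max(m, 0), max(n, 0))

def neg_part(p):
    """Negative part of a couple (m, n): its meet with (0, 0)."""
    m, n = p
    return (min(m, 0), min(n, 0))

def scale(k, p):
    """The multiple k * p, computed componentwise."""
    return (k * p[0], k * p[1])

a = (1, -1)            # the generator
print(pos(a))          # a^+
print(neg_part(a))     # a^-
for n in range(1, 6):
    # (na)^+ = n a^+ for positive n
    print(pos(scale(n, a)) == scale(n, pos(a)))
```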

THEOREM 18. In any commutative l-group $A$, the correspondence $a \to na$ is,
for any positive integer $n$, an isomorphism of $A$ onto an l-subgroup of itself.

Proof. By pure group theory, it is a group homomorphism; by Theorem 16,
it is a group isomorphism; by Theorem 17, we get $(na)^+ = (n, -n)^+ = (n, 0) =
na^+$, and so it is isomorphic with respect to the unary operation of taking the
positive part; by formulas (4)-(5), it is therefore a lattice isomorphism.

Definition. By the absolute $|a|$ of an element $a$ of an l-group, is meant
$a \vee -a$.




THEOREM 19. In any l-group, we have identically:


(23) If $a \neq 0$, then $|a| > 0$, while $|0| = 0$,

(24) $|na| = |n| \cdot |a|$ for any integer $n$,

(25) $|a| = a^+ - a^-$,

(26) $|a - b| = (a \vee b) - (a \wedge b)$,

(27) $|(a \vee b) - (a^* \vee b)| \leq |a - a^*|$ and dually.


Proof. Formulas (23)-(25) are special cases of the representation of Theorem 17.
Again, using (25),


$|a - b| = ((a - b) \vee 0) - ((a - b) \wedge 0) = ((a \vee b) - b) - ((a \wedge b) - b)$


from which (26) follows by group algebra. Finally, to prove (27), expand the
left-hand side by (26) to get $((a \vee b) \vee (a^* \vee b)) - ((a \vee b) \wedge (a^* \vee b))$, whence by the
distributive law, $|(a \vee b) - (a^* \vee b)| = ((a \vee a^*) \vee b) - ((a \wedge a^*) \vee b)$.
This reduces (27) to the case $a \geq a^*$, or $a = a^* + t$ ($t \geq 0$). But
$((a^* + t) \vee b) = (a^* \vee (b - t)) + t \leq (a^* \vee b) + t$, which takes care of this
special case.
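
Formulas (26) and (27) can be spot-checked in a simple commutative model. The sketch below is illustrative only: it assumes the l-group of integer pairs with componentwise order and tests both formulas on a small grid. The function names are hypothetical, and a finite check of this kind is of course no substitute for the proof.

```python
import itertools

def join(u, v):
    return tuple(max(x, y) for x, y in zip(u, v))

def meet(u, v):
    return tuple(min(x, y) for x, y in zip(u, v))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def absolute(u):
    """|u| = u v (-u)."""
    return join(u, tuple(-x for x in u))

def leq(u, v):
    return all(x <= y for x, y in zip(u, v))

points = list(itertools.product(range(-2, 3), repeat=2))

# (26): |a - b| = (a v b) - (a ^ b)
formula_26 = all(
    absolute(sub(a, b)) == sub(join(a, b), meet(a, b))
    for a, b in itertools.product(points, repeat=2)
)

# (27): |(a v b) - (a* v b)| <= |a - a*|
formula_27 = all(
    leq(absolute(sub(join(a, b), join(a_star, b))), absolute(sub(a, a_star)))
    for a, a_star, b in itertools.product(points, repeat=3)
)

print(formula_26, formula_27)
```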

Remark. In a commutative l-group, we can also prove the triangle inequality
$|a + b| \leq |a| + |b|$, but this does not seem to hold in general; also, the
author has been unable to generalize Theorem 7.8 of "Lattice Theory" to
l-groups which are not commutative.

Concerning the free l-group with two or more generators, much less can be
said. As an Abelian group, one can show that it has an infinite number of
disjoint independent elements. On the other hand, using the three distributive
laws (1), (1'), and that of Theorem 14, one can represent every element as a
finite meet of finite joins of finite sums


$\bigwedge_i \bigvee_j \sum_k n_k^{ij} a_k$

of the given generators $a_k$ and their inverses.¹⁶


11. l-ideals


It is well-known that the different homomorphic images of a given abstract
algebra can all be found by enumerating its different congruence relations.¹⁷
Also, with any group, the congruence relations correspond one-one with normal
subgroups: to each normal subgroup $N$ of a group $G$ corresponds the congruence
relation dividing $G$ into the cosets of $N$. Therefore, the congruence relations





16 The construction is identical with that used to prove Theorem 5.13 of "Lattice
Theory."

17 By a "congruence relation" on an abstract algebra with binary operations is meant an
equivalence relation (i.e., reflexive, symmetric and transitive relation) denoted $\equiv$ which
has, if $\circ$ is any binary operation, the "substitution property": (S) $a \equiv a'$ implies $a \circ b \equiv
a' \circ b$ and $b \circ a \equiv b \circ a'$.







on an l-group are those decompositions into cosets of normal subgroups which
have the substitution property (S) for the two lattice operations, or equivalently,
by (4)-(5), make $a \equiv b$ imply $a^+ \equiv b^+$. But these are easy to describe.

Definition. By an l-ideal of an l-group $G$, is meant a normal subgroup of $G$
which contains with any $a$, also all¹⁸ $x$ with $|x| \leq |a|$.

Clearly $G$ and $0$ are l-ideals of $G$; they are called improper l-ideals; all other
l-ideals of $G$ are called proper l-ideals.

It is a corollary that any l-ideal is a convex l-subgroup in the sense of
containing with any $a$ and $b$, also $-a$, $a + b$, $a \wedge b$, $a \vee b$, and every $x$ between
$a \wedge b$ and $a \vee b$. Indeed, if $a \wedge b \leq x \leq a \vee b$, then


$|x| = x \vee -x \leq (a \vee b) \vee -(a \wedge b)$
$= a \vee b \vee -b \vee -a = |a| \vee |b| \leq |a| + |b|$.


THEOREM 20. The congruence relations on any l-group $A$ are the partitions of $A$
into the cosets of its different l-ideals.

Proof. If $N$ is the set of elements congruent to $0$ under a congruence relation,
then $a \in N$ and $|x| \leq |a|$ imply $a \wedge -a \leq x \leq a \vee -a$; hence $0 \wedge 0 \leq
x \leq 0 \vee 0$ mod $N$, and so $x \in N$. Conversely, if $N$ is an l-ideal, then $x \equiv x'$
mod $N$ implies $|(x \vee y) - (x' \vee y)| \leq |x - x'|$ by (27), and therefore
$x \vee y \equiv x' \vee y$ mod $N$. Using left-right symmetry and duality, we see that $N$
defines a congruence relation with respect to both lattice operations,
completing the proof.

Lemma 1. If $x \leq a + b$, where $x$, $a$, $b$ are positive, then $x = s + t$, where
$0 \leq s \leq a$, $0 \leq t \leq b$.

Proof. Set $t = x \wedge b$; then $x = s + t$, where $0 \leq s = x - (x \wedge b) =
(x \vee b) - b \leq (a + b) - b = a$ and $0 \leq t \leq b$, as desired.
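
The decomposition in the proof is constructive. A minimal sketch follows, assuming for illustration the commutative l-group of integer triples with componentwise order; the function name `split` and the sample values are hypothetical.

```python
def meet(u, v):
    return tuple(min(x, y) for x, y in zip(u, v))

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def leq(u, v):
    return all(x <= y for x, y in zip(u, v))

def split(x, a, b):
    """Return (s, t) with x = s + t, given positive x, a, b with x <= a + b.

    As in the proof, t = x ^ b and s = x - (x ^ b); the argument a is not
    needed to compute s and t, only to bound s.
    """
    t = meet(x, b)
    s = sub(x, t)
    return s, t

x, a, b = (4, 1, 3), (2, 5, 0), (3, 0, 3)
s, t = split(x, a, b)
print(s, t)
```

Here `s` is bounded by `a` and `t` by `b`, as the lemma asserts.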

THEOREM 21. The l-ideals of any l-group form a complete distributive
sublattice of the (modular) lattice of all its normal subgroups.

Proof. Clearly, any intersection of l-ideals is itself an l-ideal. To prove
that the sum $S + T$ of any two¹⁹ l-ideals $S$ and $T$ is an l-ideal, suppose that
$s \in S$, $t \in T$, and $|x| \leq |s + t|$. Then for some $s' \in S$, $t' \in T$, $-(s + t) = s' + t'$,
and so


$|s + t| = (s + t) \vee (s' + t') \leq (0 \vee s \vee s') + (0 \vee t \vee t')$.


Hence $x^+ \leq |x| \leq |s + t| \leq s'' + t''$ ($s'' \in S$, $t'' \in T$). Using Lemma 1, we
can show now that $S + T$ contains $x^+$, likewise $-x^-$, and so $x = x^+ + x^-$.
Therefore $S + T$ is an l-ideal.





18 The terminology is that of Stone (op. cit.); the concept is due to the author, who called
l-ideals "normal subspaces"; F. Riesz, "Sur la théorie générale des opérations linéaires",
Annals of Math. 41 (1940), pp. 174-206, called them "Families presque complètes." Another
good term would be "absolute (normal) subgroup." Kakutani uses l-ideals in a slightly
different sense.

19 From this it follows that the sum of any number of l-ideals is an l-ideal—by the 
general logical principle that for any “closure” involving only finite operations, the closure 
of any family of ‘‘closed”’ sets is the set-union of joins of finite subfamilies of ‘‘closed’’ sets. 







It remains to prove that if $S$, $T$, and $U$ are l-ideals, then $S \wedge (T + U) =
(S \wedge T) + (S \wedge U)$. But since, in any case, $S \wedge (T + U) \geq
(S \wedge T) + (S \wedge U)$ by the lattice-theoretic semi-distributive law, and $x =
x^+ - (-x^-)$, it suffices to show that every positive $x$ in $S \wedge (T + U)$ is in
$(S \wedge T) + (S \wedge U)$. But $x \in S \wedge (T + U)$ means that $x \in S$ and $x = t + u$
($t \in T$, $u \in U$). Hence, as above, $x = |t + u| \leq t' + u'$, where $t' \in T$, $u' \in U$
are positive. Therefore, by Lemma 1, $x = t'' + u''$, where $t'' \leq x \wedge t'$ is in
$S \wedge T$ and $u'' \leq x \wedge u'$ is in $S \wedge U$. This proves $x \in (S \wedge T) + (S \wedge U)$,
as desired.

Corollary. The congruence relations on any l-group form a complete
distributive lattice.²⁰

Remark. If $A$ is any commutative l-group, and $T$ is any l-ideal of an l-ideal
$S$ of $A$, then $T$ is itself an l-ideal of $A$: the property of being an l-ideal is thus
hereditary. This follows because any subgroup of a subgroup is itself a
subgroup, and by the transitivity of inclusion. However, as Example 9 illustrates,
the same law does not hold for all non-commutative l-groups, essentially
because a normal subgroup of a normal subgroup of a group $G$ need not be normal
in $G$.




12. Disjoint l-ideals


The following sections, through §20, will deal with non-commutative l-groups
only incidentally. In the main, they will be devoted to obtaining a more
complete picture of the structure of commutative l-groups, including a
determination of all possible structure lattices of finite length, and of all those
"simple" commutative l-groups which have no proper l-ideals.

Definition. Two elements $a$ and $b$ of an l-group $G$ are called disjoint if and
only if $|a| \wedge |b| = 0$.

THEOREM 22. The set $\{a\}^*$ of all elements disjoint from any fixed element $a$
is a subgroup which contains with any $b$, all $x$ satisfying $|x| \leq |b|$.

Proof. By (23), $\{a\}^*$ contains $0$; by Theorem 15, it is closed under addition;
since $|-b| = |b|$, it contains with any element its group inverse; hence it is a
subgroup. The second assertion follows by the monotonicity law.

Corollary. In a commutative l-group, the set of all elements disjoint from
any fixed element is an l-ideal.

Example 9 shows that, in the non-commutative case, the set need not be a
normal subgroup.

We note also, since $\{a\}^*$ cannot contain $a$ unless $a = 0$, either $a = 0$, $\{a\}^* = 0$,
or $\{a\}^*$ is a proper l-ideal. This suggests the concept of a weak unit.

Definition. An element $a$ of an l-group is called a weak unit²¹ if the only
element disjoint to it is $0$.





20 This is the "structure lattice" of the l-group in the sense of the author, "On the
structure of abstract algebras," Proc. Camb. Phil. Soc. 31 (1935), p. 450. It describes the
structure (in the usual sense) of the l-group. Theorems 20-21 are due to the author.

21 The concept is due to Freudenthal, op. cit.; the useful terms "weak unit" and "strong
unit" (infra) to Bohnenblust. We note that any separable Banach lattice has a weak unit.







13. Simply ordered groups 


A partially ordered set is called "simply ordered" when, of any two elements,
one includes the other, so that


$P_4$. Given $x$, $y$, either $x \geq y$ or $y \geq x$.


This automatically implies L'-L''.

Definition. A simply ordered group²² is an l-group in which $P_4$ holds.

We note without proof the following trivial results. An l-group is simply
ordered if and only if, for any $a$, either $a$ or its inverse $-a$ is positive. An
l-group is simply ordered if and only if every subgroup is an l-subgroup. The
structure lattice of any simply ordered l-group is itself simply ordered (a chain).²³

Definition. Two l-ideals of an l-group are called disjoint if and only if their
intersection is $0$.

It is easy to show that this is the case if and only if every element of the first 
ideal is disjoint from every element of the second. 

THEOREM 23. A commutative l-group has two disjoint proper l-ideals unless it
is simply ordered.

Proof. Unless the l-group is simply ordered, it has an element $a$ which is
neither positive nor negative, so that neither $a^+$ nor $a^-$ is $0$. Hence $S = \{a^+\}^*$
will be a proper l-ideal containing $a^-$ but not $a^+$. Moreover the set $S^*$ of all
elements disjoint from all elements of $S$ will contain $a^+$ but not $a^-$.
Furthermore, being an intersection of l-ideals, it will be an l-ideal. Finally, every
element of $S$ is disjoint from every element of $S^*$.

Corollary. The structure lattice of a commutative l-group $A$ is simply ordered
if and only if $A$ is simply ordered.

Example 9 shows that the hypothesis of commutativity is essential in the
preceding results.

Digression. We have seen (Theorem 16) that in an l-group, every element
is of infinite order. We shall now show that, in the commutative case, this is
the only group-theoretic restriction implied by being an l-group.

THEOREM 24 (F. Levi²⁴). Any abstract commutative group whose elements are
all of infinite order, is the additive group of a simply ordered l-group.

Proof. Let $A$ be any Abelian group without any element of finite order
except the identity. By a well-ordered rational basis for $A$, we mean a well-




In fact, if $\{x_i\}$ is any everywhere dense countable set of positive elements, and
$\lambda_i = 1/(2^i \|x_i\|)$ for all $i$, then $e = \sum \lambda_i x_i$ is a weak unit.

22 Often called an "ordered group"; this is consistent with the terminology
"semi-ordered group" for what we have called a "partially ordered group."

23 For if the l-ideal $S$ contains an element not in the l-ideal $T$, then the absolute of this
element must exceed (not being included in) the absolute of every element of $T$, so that
$S \geq T$.

24 "Arithmetische Gesetze im Gebiete diskreter Gruppen," Rendic. Palermo 35
(1913), pp. 225-236.










ordered (finite or infinite) subset of elements $a_\alpha$ of $A$ such that every non-zero
element of $A$ is a finite rational combination $n_1 a_{\alpha(1)} + \cdots + n_r a_{\alpha(r)}$
($\alpha(1) < \cdots < \alpha(r)$) of the $a_\alpha$, while $\sum n_i a_{\alpha(i)} = 0$ implies every $n_i = 0$, or
equivalently, $\sum (m_i/n_i) a_{\alpha(i)} = 0$ implies that every $(m_i/n_i) = 0$. The
existence of a well-ordered rational basis can be proved directly by transfinite
induction, just as in the case of vector spaces.

Moreover relative to such a basis, any element of $A$ not the identity may be
called positive or negative according as its first non-zero coefficient is positive
or negative. This "lexicographic" ordering of $A$ clearly defines from it a simply
ordered group (commutative l-group).
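
The rule "first non-zero coefficient decides the sign" is easily made explicit. The following sketch is illustrative only: it assumes a finite basis, so that an element is a finite tuple of rational coefficients listed in basis order, and the function names are hypothetical.

```python
from fractions import Fraction

def sign(coeffs):
    """Sign of an element from its coefficients, listed in basis order."""
    for c in coeffs:
        if c != 0:
            return 1 if c > 0 else -1
    return 0  # all coefficients vanish: the identity

def less(u, v):
    """u < v in the lexicographic order iff v - u is positive."""
    return sign([y - x for x, y in zip(u, v)]) == 1

u = (Fraction(0), Fraction(-5), Fraction(1, 2))
v = (Fraction(0), Fraction(-5), Fraction(2, 3))
print(less(u, v))
print(sign((0, -1, 100)))
```

Since every non-zero element has a first non-zero coefficient, either it or its inverse is positive, which is the simple ordering asserted in the text.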

Corollary. A commutative group is the additive group of an l-group if and
only if it is without elements of finite order except the identity.


14. Archimedean l-groups


A gross way of comparing the magnitude of elements of l-groups is given by
the following

Definition. An element $a$ of an l-group is called incomparably smaller than
a second element $b$ (in symbols, $a \ll b$) if and only if $na < b$ for any integer $n$.

Otherwise stated, $a \ll b$ means that $b$ is an upper bound for the entire cyclic
subgroup generated by $a$. Thus in Example 4, $(0, 1) \ll (1, 0)$. It is easily
verified that the relation $\ll$ is antisymmetric and transitive; it is closely related
to the concept of an Archimedean l-group.

Definition. An l-group is called Archimedean if and only if $a \ll b$ implies
$a = 0$. (I, II, $P_1$-$P_3$)

The independence of the Archimedean property just stated from the lattice
property L'-L'' is illustrated by the easily proved fact that any subgroup of an
Archimedean l-group is itself Archimedean with respect to the same order
relation, whether it is an l-subgroup or not.

The Archimedean property can be formulated in other ways. It amounts to
asserting that the l-group has no bounded subgroups except $0$. It is equivalent
to requiring that if the set of all positive multiples of $a$ has an upper bound,
then $a \leq 0$ (Clifford). In the case of l-groups, using Cor. 2 of Thm. 13, it is
equivalent to the apparently weaker requirement that if $a > 0$, then the
sequence $a, 2a, 3a, \cdots$ has no upper bound.

In a simply ordered group, the Archimedean property is thus equivalent to
the traditional condition that for any $e > 0$ and any $b$, $ne > b$ for all sufficiently
large positive integers $n$. This means that if we let $U$ denote the set of all
rational numbers $m/n$ such that $nb \geq me$, and $L$ the set of those such that
$nb \leq me$ ($n$ positive), we get non-void sets. Moreover $L$ and $U$ together
include all elements (by $P_4$), and have at most one element in common. Hence
they are the two halves of a Dedekind cut. Again, no two distinct elements $b$
and $b'$ can determine the same cut, or we would have $(b - b') \ll e$. Finally,
by the monotonicity law (2), addition of elements is isomorphic to the addition
of cuts. We conclude







THEOREM 25. Any simply ordered Archimedean l-group is isomorphic to a
subgroup of the additive group of all real numbers, and so is commutative.

THEOREM 26. An Archimedean l-group may have a non-Archimedean
homomorphic image.

Proof. Consider the l-quotient-group of the l-group of all functions on the
interval $0 \leq x < +\infty$, modulo the l-ideal of bounded functions. In this,
$x > 0$, yet $x \ll x^2$.

Digression. We have seen that in any l-group, for any element $a$, the
equation $nx = ma$ ($n \neq 0$) has at most one solution, which we can denote $(m/n)a$
if it exists. It is worth remarking now that in any Archimedean l-group, we can
define uniquely scalar products $\lambda a$ of $a$ by any real number $\lambda$. To see this,
suppose $a$ positive; there is at most one $x$ such that $(m/n)a < x$ for all $m/n < \lambda$
and $(m/n)a > x$ for all $m/n > \lambda$. (If two, $x$ and $x'$, then $x - x' \ll a$.) This $x$
we may denote $\lambda a$, and prove that, whenever all terms exist, the usual laws of
the vector calculus hold.


15. Strong units: principal l-ideals


We have just seen that in any Archimedean simply ordered l-group, to any
$e > 0$ and $b$ corresponds a positive integer $n$ such that $ne > b$. This may be
generalized.

Definition. By a strong unit of an l-group $A$, is meant²⁶ an element $e \in A$
such that for any $b \in A$, $ne > b$ for some positive integer $n$.

Thus a strong unit must be positive. Many l-groups do not have any strong
unit. For example, the l-group of all continuous real functions on the domain
$0 \leq x < +\infty$ has the weak unit $f(x) = 1$ but no strong unit; this is a weak
corollary of the Theorem of du Bois-Reymond.²⁷ On the other hand, in the
l-group of all bounded real functions on any domain, the function $f(x) = 1$ is a
strong unit. We also note

Lemma 1. Any strong unit is a weak unit.

Proof. For any $e$, $e \wedge a = 0$ implies $ne \wedge a = 0$ for all $n$ (Thm. 22). But
if $e$ is a strong unit, $ne > a$ for some $n$ and so $e \wedge a = 0$ implies $a = ne \wedge a = 0$,
whence $e$ is a weak unit.

Even in l-groups without strong units, l-ideals may have strong units. In
fact, in any commutative l-group, every positive element is a strong unit for an
appropriate l-ideal.

THEOREM 27. (F. Riesz.²⁸) In a commutative l-group, for any $a > 0$, the
set $J(a)$ of all $b$ such that $|b| \leq na$ for some positive integer $n$ forms an l-ideal
having $a$ as strong unit. Moreover $J(a)$ is the smallest l-ideal which contains $a$.





25 This result is due to H. Cartan, "Un théorème sur les groupes ordonnés," Bull. Sci.
Math. 63 (1939), 201-205.

26 The concept goes back to Archimedes; the term to Bohnenblust.

27 Cf. for instance, G. H. Hardy, "Orders of Infinity," Cambridge Tracts, 2d ed., 1924,
p. 8. In this example, our relation $a \ll b$ is practically the usual relation $f = o(g)$.

28 F. Riesz, op. cit., p. 188.













Proof. If $|b| \leq ma$ and $|c| \leq na$, then clearly $|b + c| \leq (m + n)a$;
while if $|b| \leq ma$ and $|x| \leq |b|$, then $|x| \leq ma$; hence $J(a)$ is an l-ideal.
Obviously, $a$ is a strong unit of $J(a)$. Finally, any l-ideal containing $a$ must
contain every $na$ and so all $b$ with $|b| \leq na$.
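
For a concrete picture of $J(a)$, one may take (as an assumption for illustration) the l-group of integer triples with componentwise order. There $|b| \leq na$ for some $n$ exactly when $b$ vanishes wherever $a$ does. The sketch below searches for such an $n$ up to a bound; the names `in_J` and `max_n` are hypothetical, and the bounded search is a simplification of the definition.

```python
def absolute(u):
    """|u| = u v (-u), computed componentwise."""
    return tuple(abs(x) for x in u)

def leq(u, v):
    return all(x <= y for x, y in zip(u, v))

def scale(n, u):
    return tuple(n * x for x in u)

def in_J(b, a, max_n=1000):
    """Search for a positive integer n <= max_n with |b| <= n*a."""
    return any(leq(absolute(b), scale(n, a)) for n in range(1, max_n + 1))

a = (2, 0, 1)
print(in_J((-7, 0, 30), a))   # |b| <= 30 * a
print(in_J((1, 1, 0), a))     # the second coordinate is never dominated
```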

Corollary. Any commutative non-Archimedean l-group has a proper l-ideal.

For if $a \ll b$ for some $a \neq 0$, then $J(|a|)$ is an l-ideal which fails to contain $b$,
yet contains $a \neq 0$, and so is proper.

Definition. An l-ideal of an l-group will be called principal if and only if it
has a strong unit.

THEOREM 28. If the structure lattice of a commutative l-group has finite
length $r$, every l-ideal is principal.

Proof. Let $J$ be an l-ideal of such an l-group $A$. The case $J = 0$ is trivial.
If $J > 0$, choose any $a_1 \neq 0$ in $J$ and form $J(|a_1|)$. If $J > J(|a_1|)$, choose
any $a_2$ in $J$ but not in $J(|a_1|)$ and form $J(|a_1| + |a_2|)$. After repeating this
process at most $r$ times, we will get a principal l-ideal equal to $J$.

THEOREM 29. In any commutative l-group $A$, the principal l-ideals form a
topologically dense sublattice of the structure lattice of $A$.

Proof. It can be proved easily that


$J(a \wedge b) = J(a) \wedge J(b)$ and $J(a + b) = J(a) + J(b)$,


hence they form a sublattice. This sublattice is dense in the structure lattice
of $A$, since any l-ideal $J$ is the supremum (in fact, set-union) of the finite joins
$\bigvee_i J(a_i)$ of the principal l-ideals contained in $J$, and these form an ascending
directed set of principal l-ideals which thus converges to its supremum in the
sense of Moore-Smith.


16. Extension problem 


It is natural to say that an l-group is simple if and only if it has no proper
l-ideals, or equivalently, no proper congruence relations. Analogy with pure
group theory then suggests the program²⁹ of first determining all simple l-groups,
and then showing how the most general l-group whose "structure lattice" is of
finite length can be built up from its simple quotient-l-groups.

The first problem has been solved in the commutative case. Indeed, a simple
commutative l-group must be simply ordered (by Theorem 23) and Archimedean
(by the Cor. of Thm. 27). Hence (by Theorem 25) we have

THEOREM 30. The only commutative simple l-groups are the subgroups of the
additive group of real numbers.

The second problem involves in particular the specific task of enumerating
all the l-groups having a given l-ideal $J$ and l-quotient-group $A/J$ ("Extension
Problem"). While not attempting a complete solution of this, some
fragmentary results may be stated.





29 The logical outline is the same, but the technique is very different. Cf. O. Schreier,
"Über die Erweiterungen der Gruppen," Monats. Math. u. Phys. 34 (1926), p. 165, and
Hamb. Abh. 4 (1927), pp. 321-346.







Given two l-groups $S$ and $T$, one can form the l-group $ST$ of all couples
$(s, t)$ ($s \in S$, $t \in T$), where both the group operation and the lattice operations
are performed on the $S$-components and $T$-components independently, so that


$(s, t) \circ (s', t') = (s \circ s', t \circ t')$


where $\circ$ is $+$, $\wedge$, or $\vee$. This is the direct union of $S$ and $T$ in the sense of
universal algebra; we shall call it the cardinal product $ST$ of $S$ and $T$. The
elements $(0, t)$ form an l-ideal of $ST = A$ isomorphic with $T$, and the
l-quotient-group $A/T$ is isomorphic with $S$. Hence the extension problem always has at
least one solution.

One can also form the lexicographic or ordinal product³⁰ $S \circ T$ of any two
l-groups. This consists of the couples $(s, t)$ ($s \in S$, $t \in T$) just as before. But
the set of positive elements consists of those couples $(s, t)$ with $s > 0$ or $s = 0$
and $t \geq 0$, instead of those with $s \geq 0$ and $t \geq 0$ as in the case of cardinal
products. In any case, $S \circ T$ is a partially ordered group. If $S$ is simply
ordered, it is an l-group, in which the elements $(0, t)$ form as before an l-ideal
isomorphic with $T$, whose l-quotient-group is isomorphic with $S$. Hence if $S$
is simply ordered, the extension problem has at least two solutions.
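
The two products differ only in their sets of positive elements, which can be written down directly. The sketch below is illustrative only and assumes that both factors are the integers under addition; the function names are hypothetical.

```python
def positive_cardinal(p):
    """Positive elements of the cardinal product: s >= 0 and t >= 0."""
    s, t = p
    return s >= 0 and t >= 0

def positive_ordinal(p):
    """Positive elements of the ordinal product: s > 0, or s = 0 and t >= 0."""
    s, t = p
    return s > 0 or (s == 0 and t >= 0)

p = (1, -5)
print(positive_cardinal(p))   # not positive in the cardinal product
print(positive_ordinal(p))    # positive in the ordinal product

# With a simply ordered first factor, every element or its inverse is positive.
simple = all(
    positive_ordinal((s, t)) or positive_ordinal((-s, -t))
    for s in range(-3, 4)
    for t in range(-3, 4)
)
print(simple)
```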


17. Direct decompositions 


In the cardinal product $ST = A$, both $S$ and $T$ correspond to l-ideals.
Moreover they correspond to complementary l-ideals, in the usual sense that
$S \wedge T = 0$ and $S + T = A$. Just as in the case of pure group theory, the
converse also holds.

THEOREM 31. An l-group $A$ is isomorphic to the cardinal product $ST$ if and
only if it contains complementary l-ideals isomorphic with $S$ and $T$ respectively.

Proof of Converse.³¹ Suppose $A$ has complementary l-ideals $S$ and $T$. Then by group
theory, each element $a \in A$ has a unique representation $a = s + t$ ($s \in S$, $t \in T$),
while group operations are performed on the $S$- and $T$-components
independently. As regards order, $s + t \leq s' + t'$ if and only if


$(-s' + s) \leq (t' - t) \leq |t' - t|$,


whence $(-s' + s) \leq |t' - t| \wedge |-s + s'| \in T \wedge S = 0$, and likewise
$t - t' \leq 0$. This means $s \leq s'$ and $t \leq t'$, q.e.d.

From the preceding result, Theorem 21, and the general theory of distributive
lattices, we obtain just as in “Lattice Theory,” Theorem 5.15, the following
corollaries.

THEOREM 32. Any two representations of an l-group as a cardinal product have
a common refinement.





30 For the general significance of cardinal and ordinal products, cf. the author’s article
“Generalized arithmetic,” to appear in the Duke Journal of Mathematics. It is shown
there that the ordinal product of two lattices is itself a lattice if and only if the left-factor
is simply ordered, or the right-factor has universal bounds.

31 For a brief proof, relying more heavily on principles of universal algebra, cf. also
“Lattice Theory,” p. 110, below Theorem 7.11.










COROLLARY. If the structure lattice of an l-group A has finite length, then A
has a unique representation as the cardinal product of indecomposable factors.


18. Main structure theorem 


In the present section, we shall show that the structure of a commutative
l-group is of a very special kind. Indeed, by Theorem 23, any commutative
l-group in which the l-ideal 0 is meet-irreducible (or “prime”), is simply ordered.
From this (cf. footnote 23) we conclude

Lemma 1. The structure lattice of a commutative l-group in which the l-ideal 0
is meet-irreducible, is a chain.

But now if J is any l-ideal of an l-group A, the l-ideals of A which contain J
form a lattice isomorphic with the structure lattice of A/J; indeed, this is a
principle of universal algebra, holding for all congruence relations. Combining
this result with Lemma 1, we get

Lemma 2. The elements of the structure lattice of any commutative l-group 
which contain any meet-irreducible element, form a chain (simply ordered set). 

If we apply Lemma 2 to the general representation theory of finite distributive
lattices (“Lattice Theory,” Theorem 5.3), we get a conclusive result.

Any distributive lattice L of finite length may be described in terms of the
partially ordered set X of its meet-irreducible elements $a_i$. Every element
$c \in L$ is the meet $\bigwedge a_i$ of the set $S_c$ of the meet-irreducible elements which
contain c. Moreover, as in “Lattice Theory,” Theorem 5.3, the correspondence
$c \to S_c$ is a dual isomorphism between L and the “J-closed” subsets of X,
i.e., the subsets of X which contain with any $a_i$ all $a_j \ge a_i$.

This clearly applies to the structure lattice of any l-group, provided it has 
finite length. Moreover if the l-group is commutative, Lemma 2 restricts X 
greatly. 

DEFINITION. A partially ordered system X is called a semitree if, for any
element $a \in X$, the set of all $x \le a$ is a chain; it is a tree if it has a least element 0.
The dual of a tree (semitree) is called a root (semiroot).³²

We have shown that the meet-irreducible (“prime”) l-ideals of any commutative l-group form
a semiroot. But now it is easy to show that any finite semiroot is the sum of 
the subsets contained in its different maximal elements: the elements under- 
neath its different maximal elements form components having no connection 
with each other (no common subelements or superelements). 
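
The definition of a semiroot and the splitting under maximal elements can be tried out on a small finite example. This is a minimal sketch: the poset below (given by hypothetical `parent` links, so that each element has at most one cover above it) and all function names are illustrative assumptions, not taken from the text.

```python
def is_chain(elems, leq):
    """True when every two of the given elements are comparable."""
    return all(leq(x, y) or leq(y, x) for x in elems for y in elems)

def is_semiroot(X, leq):
    """Dual of the semitree condition: each set {x : x >= a} is a chain."""
    return all(is_chain([x for x in X if leq(a, x)], leq) for a in X)

def components(X, leq):
    """Group the elements of a finite semiroot under its maximal elements."""
    maximal = [m for m in X if not any(leq(m, x) and x != m for x in X)]
    return {m: sorted(x for x in X if leq(x, m)) for m in maximal}

# Hypothetical example: two "roots" hanging from the maximal elements r and s.
parent = {"r": None, "a": "r", "b": "r", "c": "a", "s": None, "d": "s"}

def leq(x, y):
    """x <= y iff y is reached from x by following parent links upward."""
    while x is not None:
        if x == y:
            return True
        x = parent[x]
    return False

X = list(parent)
parts = components(X, leq)

# A non-example: one element lying under two incomparable elements.
def leq_fork(x, y):
    return x == y or (x == "x" and y in ("p", "q"))

fork_is_semiroot = is_semiroot(["x", "p", "q"], leq_fork)
```

In the first poset the two components share no element, matching the statement that the parts under different maximal elements have no connection with each other; the fork fails the semiroot condition because the elements above `x` are not a chain.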

Consequently, either the structure lattice contains complemented elements
(namely, the meets of the sets of elements under the different maximal
meet-irreducible elements), or the set X of meet-irreducible elements has a greatest element J. In the
first case, the l-group is directly decomposable, by Theorem 31. In the second
case, the J-closed subset consisting of J alone is a least non-void J-closed subset,

32 The Hasse diagram of any “tree” looks like a tree, and that of a “root” like a root
(tree upside down). Further, the graph of a “tree” (or root!) is a tree in the technical
sense of the theory of graphs. G. Kurepa has studied roots extensively, under the name of
“tableaux ramifiés.”







which thus corresponds under our dual isomorphism to a greatest proper l-ideal.
We can state our result as follows. 

THEOREM 33. A commutative l-group whose structure lattice has finite length, 
either (i) is a cardinal product, or (ii) has a unique maximal proper l-ideal. 


19. Solution of extension problem 


We shall now show that a commutative l-group A with a unique maximal
proper l-ideal J is a kind of mixed ordinal product of A/J and J. This will
give us a method for constructing, by successive extensions, all commutative
l-groups having finite structure lattices.

First, an element a of A not in J must be either positive or negative. For
consider the sum of the l-ideals generated by $a^+$ and $a^-$; it contains a, hence is
not contained in J, hence it is A. But this expresses A as a sum of disjoint
l-ideals; by hypothesis, A is join-irreducible; hence one of the l-ideals is A and
the other (being disjoint) is 0, and $a^+$ or $a^-$ is 0, as desired.

Second, a is positive or negative in A according as it is positive or negative 
in A/J, since a homomorphism carries positive elements into positive elements 
and dually. Hence A is determined to within isomorphism by its group struc- 
ture, the order structure of J, and the order structure of A/J. The positive 
elements of A are those which have their (A/J)-component greater than zero, 
or have their (A/J)-component equal to zero and their J-component positive. 
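
In symbols, the description of the positive elements just given may be sketched as follows. The notation $A^{+}$ for the set of positive elements and $a + J$ for the coset of $a$ is a convention assumed here for illustration, not taken from the text.

```latex
A^{+} \;=\; \{\, a \in A : a + J > J \ \text{in } A/J \,\}
      \;\cup\; \{\, a \in J : a \ge 0 \ \text{in } J \,\}.
```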

This definition gives, conversely, from any abstract Abelian group A which
has a lattice-ordered subgroup J and simply ordered quotient-group A/J, an
l-group which may be called a mixed ordinal product of J and A/J. Clearly the
mixed ordinal products of J and A/J correspond one-one to the solutions of the
group-theoretic extension problem of finding all Abelian groups A with a
subgroup isomorphic with J and a quotient-group isomorphic with A/J. In case A
is the direct union of J and A/J, we get the pure ordinal product; otherwise,
we get something different.

Now by Theorem 33, and induction (cf. the last Remark of §11) on the 
length of the structure lattice, we get 

THEOREM 34. Any commutative l-group whose structure lattice has finite length 
can be built up from simple l-groups by forming successive cardinal products and 
mixed ordinal products. 

This result can be applied directly to vector lattices. It is known³³ that the
additive group of real numbers is the only simple vector lattice. Moreover it
can be shown that for finite-dimensional vector lattices, the only group-theoretic
solution of the extension problem is given by the direct union. We conclude

COROLLARY 1. Any vector lattice of finite dimension can be built up from the
group of real numbers under addition by repeated formation of cardinal and ordinal
products.





33 Mr. Murray Mannos, a graduate student at Harvard University, is writing a
dissertation on vector lattices of finite dimension which includes this and many other results.










We can state this somewhat cabalistically, using the generalized arithmetic
notation of the author, as

COROLLARY 2. The most general vector lattice of finite dimension is ${}^{Y}R^{\#}$, where
$R^{\#}$ denotes the additive group of real numbers, and Y denotes the most general
semiroot.³⁴

Incidentally, the structure lattice of ${}^{Y}R^{\#}$ is $B^{Y'}$, where Y′ denotes the
semitree dual to Y, ordinal exponentiation is replaced by cardinal exponentiation,
and B is the chain of two elements.

Going back to Lemma 2 of §18, the discussion of distributive lattices
following it, and using Corollary 2 for the converse, we get a final result.

THEOREM 35. A lattice of finite length is the structure lattice of a commutative
l-group, if and only if it can be written $B^{Y}$, where Y is the most general semitree.


20. Subdirect decompositions 


We shall now consider the representations of commutative l-groups as
l-subgroups of cardinal products of smaller l-groups, or, as we shall say for short,
as subdirect products.

Just as in the case of groups (cf. “Lattice Theory,” p. 52) it may be shown
that the representations of an l-group as a subdirect product correspond one-one
to choices of sets of l-ideals having 0 for meet. In the case of structure lattices
of finite length, we can thus show that commutative l-groups are subdirect
products of l-groups in which 0 is meet-irreducible, and hence (§18, Lemma 1)
of simply ordered l-groups. We shall now show that the restriction to the case
of structure lattices of finite length is unnecessary.

Lemma 1. Let a be any non-zero element of a commutative l-group A. There
exists an l-ideal J in A such that $a \notin J$ yet J/J is meet-irreducible in A/J.

PROOF. By transfinite induction, we can construct a maximal l-ideal J which
does not³⁵ contain a. It follows that any l-ideal of A which properly contains J
will contain J(a); hence J is meet-irreducible. We infer that the meet of all
meet-irreducible l-ideals of A is 0 in any case; hence that A is a subdirect product
of l-quotient-groups A/J in which 0 is meet-irreducible, and so which are simply
ordered.

THEOREM 36. Any commutative l-group is isomorphic with an l-subgroup of a
cardinal product of simply ordered l-groups.³⁶





34 It has been pointed out to the author by A. H. Clifford and I. Kaplansky that Theorem
34 and its corollaries may be looked on as generalizing to the lattice-ordered case, the basic
results of H. Hahn (“Über die nichtarchimedischen Grössensysteme,” S.-B. Wiener Akad.
Math.-Nat. Klasse Abt. IIa, 116 (1907), pp. 601-653) on the classification of simply ordered
groups.

35 The construction is identical with that used by Stone in constructing prime ideals in
Boolean rings; it has been used so often that it will not be repeated here. Since an l-ideal
J is meet-irreducible if and only if $|a| \wedge |b| \in J$ implies $|a| \in J$ or $|b| \in J$, there is
justification for calling the meet-irreducible l-ideals prime l-ideals.

36 This is closely related to Satz 14 of P. Lorenzen, “Abstrakte Begründung der
multiplikativen Idealtheorie,” Math. Zeits. 45 (1939), pp. 533-553.















A special problem is that of trying to make the simply ordered l-groups
Archimedean, so as to get a representation by means of real functions. For this
to be possible, the original l-group must certainly be Archimedean; this condition
would also be sufficient if it were not for Theorem 26. Much work has been
done in attacking special cases of the problem.³⁷













21. Effect of chain condition 


We shall now turn our attention to l-groups in which all bounded sets have
l.u.b. and g.l.b. A special case is furnished by l-groups which satisfy the chain
condition.

DEFINITION. An l-group will be said to satisfy the chain condition³⁸ if and
only if
(C) every non-void set of positive elements includes a minimal member.






Any element which covers 0 will be called a prime. 

Lemma 1. Any two primes are permutable. 

This is a corollary of Lemma 1 of §9. It is a corollary that the primes generate
an Abelian subgroup, consisting of all elements which can be expressed as sums
$n_1 p_1 + \cdots + n_s p_s$ of a finite number of distinct primes.

Now let $a > 0$ be given, and consider all those differences $a - \sum n_i p_i$ which
are positive. By the chain condition, one of these must be minimal, and so
cannot contain any prime q (otherwise $a - (\sum n_i p_i + q)$ would be smaller).
Again by the chain condition, every positive element b except 0 contains a prime,
namely, some minimal x such that $0 < x \le b$. Hence our minimal difference
must be 0, so that $a = \sum n_i p_i$.

But every element can be expressed as a difference of positive elements:
$c = c^+ - (-c)^+$ for all c; hence

Lemma 2. Any element not 0 can be expressed as a sum of integral multiples
of a finite number of distinct primes, as $a = n_1 p_1 + \cdots + n_s p_s$.

Putting Lemmas 1-2 together, we infer that our l-group is commutative.
Now if we distinguish positive and negative coefficients, we get an expression
for any $a \ne 0$ as


$$a = m_1 p_1 + \cdots + m_r p_r - n_1 q_1 - \cdots - n_s q_s. \qquad (m_i, n_j > 0)$$


Clearly a cannot be positive unless $q_j \le m_1 p_1 + \cdots + m_r p_r$ for all j. But
since distinct primes are disjoint, by Theorem 15 $q_j$ is disjoint from $\sum m_i p_i$;
hence a cannot be positive unless no negative coefficients occur.

Lemma 3. In Lemma 2, a is positive if and only if every $n_i$ is positive.















37 Cf. F. Bohnenblust, op. cit.; S. Kakutani, “Weak topology, bicompact set, and the
principle of duality,” Proc. Imp. Acad. Tokyo 16 (1940), pp. 63-67, Thm. 6; Stone, op. cit.;
M. and S. Krein, Doklady 27 (1940), pp. 427-430; and K. Yosida, “On vector lattice with a
unit,” Proc. Imp. Acad. Tokyo 17 (1941), pp. 121-124.

38 Ore uses the word “Archimedean” to mean the same thing, but our terminology is
more common. In the simply ordered case, (C) implies that every non-void set of positive
elements has a least member (well-ordering condition), so that the integers form the only
simply ordered l-group satisfying the chain condition.













It is a corollary that a is zero (positive and negative) if and only if every $n_i$
is positive and negative, which is absurd. It follows that the representation
of Lemma 2 is unique; for if a had two different representations, their
formal difference would give a representation of 0. We can summarize.

THEOREM 37. Let A be any l-group which satisfies the chain condition. Then
A is commutative, and each non-zero element of A can be expressed uniquely as a
sum of integral multiples of distinct primes.³⁹ Such a sum is positive if and only if
no coefficient is negative.

It is a corollary that A is determined to within isomorphism by the cardinal 
number of the set of its primes. 
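
A familiar instance may help fix ideas. The sketch below assumes, as an illustration chosen here and not taken from the text, the multiplicative group of positive rationals ordered by divisibility (q is "positive" when q is an integer). Its primes in the sense above are the prime numbers, and a "sum of integral multiples of primes" becomes, in multiplicative notation, a product of integer powers. The function names are hypothetical.

```python
from fractions import Fraction

def prime_exponents(n):
    """Trial-division factorization of a positive integer: {prime: exponent}."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def representation(q):
    """Map each prime p to its non-zero integer coefficient n_p in q = prod p**n_p."""
    q = Fraction(q)
    coeffs = prime_exponents(q.numerator)
    for p, e in prime_exponents(q.denominator).items():
        # Numerator and denominator are coprime, so no coefficient cancels to 0.
        coeffs[p] = coeffs.get(p, 0) - e
    return coeffs

def is_positive(q):
    """Positive in the divisibility order iff no coefficient is negative."""
    return all(n > 0 for n in representation(q).values())

rep = representation(Fraction(12, 35))
```

Here `rep` assigns coefficient 2 to the prime 2, coefficient 1 to 3, and coefficient -1 to each of 5 and 7, so 12/35 is not positive, in agreement with the criterion of Theorem 37.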


22. Application to ideal theory 


This suggests an approach to the so-called “fundamental theorem of ideal
theory” quite different from the modern approach,⁴⁰ and much nearer to the
classical one. Let F be any field, and let H be any subring of “integers” of F
which contains unity. By an ideal in F, we mean a subset which contains with
any two elements their sum and difference, and with any element all its integral
multiples. Multiplication of ideals is according to the usual definition.

It is clear that the non-zero ideals form a lattice with respect to set-inclusion,
which in many important cases can be proved by extremely general arguments
to satisfy the chain condition.⁴¹

It is also clear that multiplication of ideals is commutative and associative, 
and that ideal multiplication is distributive on addition (the lattice-join). 
Therefore we have all of the postulates of Theorem 7 satisfied except the exist- 
ence of inverses. 

It follows that, in the most important cases, in order to establish the unique 
factorization of ideals into primes, we need only supplement general arguments 
by proving that every ideal has an ideal inverse—or equivalently, that the product 
of every ideal by a suitable ideal gives a principal ideal. 


23. Completeness 


Many important l-groups are complete, in the sense of the following

DEFINITION. An l-group A is called complete (σ-complete) if and only if every
non-void (resp. countable) bounded set has a g.l.b. and a l.u.b.

Remark 1. By Theorem 2, the existence of g.l.b. implies that of l.u.b.; and





39 In the commutative case, this result is essentially well-known. Cf. for example M.
Ward, “Residuated distributive lattices,” Duke Jour. 6 (1940), pp. 641-651; also A. Clifford,
“Arithmetic and ideal theory of abstract multiplication,” Bull. Am. Math. Soc. 40 (1934),
p. 329, Thm. 2.

40 For the modern treatment of E. Noether, cf. van der Waerden’s “Moderne Algebra,”
1st ed., vol. 2, pp. 98-102. For the classical treatment cf. D. Hilbert, “Théorie des corps de
nombres algébriques,” Paris, 1913. Remarks much like ours are made on p. 13 of Krull’s
“Idealtheorie.”

41 Cf. van der Waerden, op. cit., §80. By a “positive” ideal, we mean one which contains
H, which is an identity for multiplication. The “negative” ideals are thus the ideals which
are integral, in the usual terminology.










using Theorem 1, one can even show that it is enough to require that every
non-void set of positive elements have a g.l.b.

Remark 2. The chain condition implies completeness. For if S is a non-void
set of positive elements $s_\alpha$, then the finite meets $\bigwedge s_{\alpha(i)}$ include a minimal
member a by the chain condition. But every $s_\alpha \wedge a$, being itself a finite meet
and so not properly contained in a, will be a. Thus a is a lower bound for S;
it obviously contains every lower bound.

Next, let x be any element of any l-group A. If s is any upper bound for the
set {nx}, then by Theorem 1 so are $s + x$ and $s - x$. It follows that {nx}
cannot have a least upper bound unless $x \ge 0$ and $x \le 0$.
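
The step from the two translated upper bounds to the conclusion can be spelled out as follows. This is a sketch that writes $s$ for a supposed least upper bound of the set of multiples $nx$; the symbol $\sup$ is notation assumed here.

```latex
\begin{align*}
nx \le s \ \text{for all } n
  &\;\Longrightarrow\; (n+1)x \le s \ \text{for all } n
   \;\Longrightarrow\; nx \le s - x \ \text{for all } n, \\
s = \sup\{nx\}
  &\;\Longrightarrow\; s \le s - x \;\Longrightarrow\; x \le 0, \\
\text{and symmetrically}\quad s \le s + x
  &\;\Longrightarrow\; x \ge 0 .
\end{align*}
```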

Lemma 1. The set of all integral multiples of a non-zero element cannot have a
l.u.b. (I, II, $P_2$)

COROLLARY. Unless A = 0, any l-group A contains a countable set without a
least upper bound.

It also follows that, if A is σ-complete, the set {nx} cannot have an upper
bound (or it would have a l.u.b.). In other words,

THEOREM 38. Any σ-complete l-group is Archimedean.

COROLLARY. If an l-group can be embedded group- and order-isomorphically
in a complete l-group, then it is Archimedean.

Conversely, Clifford (op. cit.) has proved that any commutative Archimedean
l-group⁴² can be completed by cuts in the sense of Dedekind-MacNeille, to give
a complete commutative l-group. Combining, we have

THEOREM 39 (Clifford). A commutative l-group can be embedded in a complete 
l-group if and only if it is Archimedean. (I, II, Pi-Ps , (14)) 


24. Infinite distributivity 


It was proved (Theorem 1) that in an l-group, any group translation is a
lattice automorphism. Consequently, it carries infinite joins and meets into
infinite joins and meets, respectively. The formulas expressing this fact appear
as the infinite distributive laws


(28)  $a + \bigvee x_\alpha = \bigvee (a + x_\alpha)$,   $a + \bigwedge x_\alpha = \bigwedge (a + x_\alpha)$,
      $\bigvee x_\alpha + b = \bigvee (x_\alpha + b)$,   $\bigwedge x_\alpha + b = \bigwedge (x_\alpha + b)$.


Similarly, since every correspondence of the form $x \to a - x$ is a dual
automorphism, we have the formal laws


(29)  $a - \bigvee x_\alpha = \bigwedge (a - x_\alpha)$ and dually.

Now let $v = \bigvee x_\alpha$. Then, for all a and α,


$$0 \le (a \wedge v) - (a \wedge x_\alpha) \le v - x_\alpha \quad \text{by (27)}.$$





42 Actually, L′-L″ may be replaced for this purpose by the far weaker condition (14)
(Moore-Smith property). In the present case, the cuts appear as so-called v-ideals; cf.
Krull’s v-Gruppensatz, “Idealtheorie,” p. 120.







But $\bigwedge (v - x_\alpha) = v - \bigvee x_\alpha = v - v = 0$ by (29); and by what we have just
seen, $0 \le \bigwedge [(a \wedge v) - (a \wedge x_\alpha)] \le \bigwedge (v - x_\alpha)$; hence


$$0 = \bigwedge [(a \wedge v) - (a \wedge x_\alpha)] = (a \wedge v) - \bigvee (a \wedge x_\alpha).$$

Transposing, we get the first of the further infinite distributive laws

(30)  $a \wedge \bigvee x_\alpha = \bigvee (a \wedge x_\alpha)$ and $a \vee \bigwedge x_\alpha = \bigwedge (a \vee x_\alpha)$;


the second follows by duality. Summarizing, we have

THEOREM 40 (Kantorovitch⁴³). The infinite distributive laws (28)-(30) hold in
any complete l-group.


25. Closed |-ideals 


In a complete commutative l-group, the complemented l-ideals may also be
characterized in terms of closure properties. To see this, let us define for any
set S of elements of an l-group G, the polar⁴⁴ S* of S as the set of all elements
disjoint from every element of S.

If S is a complemented l-ideal with complement T, then $y \in T$ implies, for
all $x \in S$, that


$$\big|\, |x| \wedge |y| \,\big| = |x| \wedge |y| \le |x| \ \text{and} \ |y|.$$


Hence $|x| \wedge |y| \in S \wedge T = 0$, and $x \perp y$, proving $y \in S^*$. Conversely, if
$z = x + y$ ($x \in S$, $y \in T$) is in S*, then


$$0 = |z| \wedge |x| = (|x| + |y|) \wedge |x| = |x|,$$


whence $z = y$ is in T. This shows $T = S^*$; by symmetry, $S = T^* = (S^*)^*$.

But now for any subset T, the set T* is an l-ideal by Theorem 22, provided G
is commutative. Further, by (30), if G is complete, then T* is a closed l-ideal
in the sense of the following definition.

DEFINITION. An l-ideal J of a complete l-group G is called closed⁴⁵ if and only
if J contains with any bounded subset $\{x_\alpha\}$, also $\bigvee x_\alpha$.

Remark. Since the correspondence $x \to -x$ leaves J setwise invariant and
inverts order, it follows that J also contains $\bigwedge x_\alpha$. Further, since any l-ideal
is convex (§11), closure in the sense of the preceding definition is equivalent to
topological closure in the intrinsic topology.⁴⁶





43 “Lineare halbgeordnete Räume,” Math. Sbornik, 2 (44) (1937), pp. 121-168, esp.
Theorems 10-21. Kantorovitch assumed commutativity, but this does not play an
essential role.

44 It follows from the general theory of relations (cf. “Lattice Theory,” §32), since the
relation of disjointness is symmetric and anti-reflexive, that (i) if we denote (S*)* by $\bar{S}$,
then the operation $S \to \bar{S}$ is a closure operation, (ii) if we call S “closed” when $S = \bar{S}$, then
any intersection of “closed” sets is itself closed, (iii) 0 is closed, (iv) the correspondence
$S \to S^*$ is a dual automorphism of the lattice of “closed” sets.

45 Closed l-ideals are the “familles complètes” of F. Riesz, op. cit.; Riesz proved Theorem
42 for principal l-ideals. Condition (ii) below shows the concept also specializes to that of
a “v-ideal” (Krull).

46 As defined on p. 32 of the author’s “Lattice Theory.”










We have seen that any complemented l-ideal is the polar of its complement,
and that the polar of any subset of a complete commutative l-group is a closed
l-ideal; we shall now complete the circle of reasoning by showing that any
closed l-ideal is complemented, yielding

THEOREM 41 (Riesz). For any l-ideal J of a complete commutative l-group,
the following assertions are equivalent: (i) J is complemented, (ii) J = (J*)*,
(iii) J is closed. If (i) holds, then J* is the complement of J.

COMPLETION OF PROOF. If J is a closed l-ideal of any complete l-group G,
then for any positive $a \in G$ we can form the J-component $a_J$ of a, as


$$a_J = \bigvee_{x \in J} x \wedge a = \bigvee_{x \in J,\ x \ge 0} x \wedge a.$$


Since G is complete, and $0 \le x \wedge a \le a$ for all $x \ge 0$, $a_J$ exists. Moreover
since J is closed and every $x \wedge a$ is in J, $a_J$ is in J. Hence for all positive
$x \in J$, since $(x + a_J)$ is positive and in J,


$$a_J \le (x + a_J) \wedge a \le \bigvee_{x \in J} x \wedge a = a_J.$$


But now by the distributive law (1),

$$(x + a_J) \wedge a = x \wedge (a - a_J) + a_J,$$


whence, cancelling, $x \wedge (a - a_J) = 0$. Thus $a - a_J$ is in J*. It follows that
J + J* includes all positive elements $a = a_J + (a - a_J)$ of G, and hence all
elements of G by Cor. 2 of Thm. 13, so that J + J* = G. But evidently
$J \wedge J^* = 0$, which shows that J is complemented with complement J*, as
asserted in the Theorem.

COROLLARY. Any intersection of complemented l-ideals of a complete
commutative l-group is itself complemented.


26. Weak units and direct decompositions 


Since any intersection of closed l-ideals is itself closed, it is natural to try to
describe explicitly the intersection of all closed l-ideals which contain a fixed
positive element a; in other words, the closed l-ideal generated by a.

We can answer this question (in complete commutative l-groups) by direct
appeal to Theorem 41. Using condition (ii), we see that (a*)* is the smallest
closed l-ideal which contains a; further, it is the largest closed l-ideal having a
for weak unit. We can also describe (a*)* in another way, using Theorem 27.
Clearly any closed l-ideal which contains a will contain all x such that
$x = \bigvee (na \wedge x) = \bigwedge (-na \vee x)$; but conversely, the set of all such x is a closed l-ideal
containing a. It is of course the topological closure of the principal l-ideal
generated by a.

Now let A be any l-group with weak unit e. If A can be represented as the
cardinal product $A_1 \cdots A_n$ of smaller l-groups, then the components of e in the
different $A_i$ are disjoint elements whose sum is e.

DEFINITION. By a decomposition of a positive element e of an l-group A, is







meant a set of disjoint elements $e_i$ whose sum is e. By a component of e is meant
an element e′ such that⁴⁷ $e' \wedge (e - e') = 0$.

THEOREM 42. The components of any positive element of any l-group form a
Boolean algebra.

PROOF. They are the elements x of the distributive lattice of all elements
$0 \le x \le e$ which have complements, by definition and Theorem 13. These
form a Boolean algebra, by Theorem 6.2 of the author’s “Lattice Theory.”
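
Theorem 42 can be checked by brute force in a small case. The sketch below assumes, purely for illustration, the l-group of integer triples with componentwise order, and the element e = (2, 0, 3); the helper names are hypothetical.

```python
from itertools import product

def meet(a, b):
    return tuple(min(x, y) for x, y in zip(a, b))

def join(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def components(e):
    """All c with 0 <= c <= e and c meet (e - c) = 0, found by exhaustive search."""
    zero = tuple(0 for _ in e)
    box = product(*(range(k + 1) for k in e))
    return [c for c in box if meet(c, sub(e, c)) == zero]

e = (2, 0, 3)
comps = components(e)

# Boolean-algebra checks: closure under meet, join, and complement c -> e - c.
closed_under_lattice_ops = all(
    meet(a, b) in comps and join(a, b) in comps for a in comps for b in comps
)
closed_under_complement = all(sub(e, c) in comps for c in comps)
```

Each coordinate of a component is either 0 or the full coordinate of e, so this e has four components, forming the Boolean algebra of subsets of its two non-zero coordinates.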

THEOREM 43. Let A be any complete commutative l-group with weak unit e.
Then the direct decompositions of A correspond one-one with the decompositions of e.

PROOF. We have already seen that the components of e under any direct
decomposition of A create a decomposition of e. But conversely, let
$e = e_1 + \cdots + e_n$ be any decomposition of e, and let $A_i$ denote the closed l-ideal
generated by $e_i$. Since the $e_i$ are disjoint, we will have $e_i \perp A_j$ if $i \ne j$, and
so $A_i \perp A_j$, or, what comes to the same thing, $A_i \wedge A_j = 0$. But the sum
of the $A_i$ contains $e_1 + \cdots + e_n$, and is a closed l-ideal; hence it contains
$A = 0^* = (e^*)^*$.

This completes the proof; we note in passing that the example of the ordinal
product of the additive group of the integers, with the cardinal product of the
additive group of the integers with itself (in symbols, $J \circ (JJ)$), shows that the
hypothesis of completeness is not redundant.


27. Residuated lattices 


The concept of l-group can be generalized in two ways: one can weaken either
the group or the lattice postulates. The least essential group postulate seems
to be the one requiring the existence of inverses. If this is dropped, we arrive
at something very close to the usual concept of a residuated lattice.⁴⁸

DEFINITION. Let G be any (additively written) groupoid, or associative system
with identity 0. If G is also a lattice, and satisfies


(28)  $a + \bigvee x_\alpha = \bigvee (a + x_\alpha)$ and $\bigvee x_\alpha + b = \bigvee (x_\alpha + b)$,


it will be called an l-groupoid. In any l-groupoid, the left-residual a:b of b by a
is defined as the join of all x such that $x + b \le a$. The right-residual a::b of b by a
is defined as the join of all y such that $b + y \le a$.

Clearly every l-group is an l-groupoid, in which a:b is $a - b$ and a::b is
$-b + a$. Also, by (28), $(a:b) + b \le a$ and $b + (a::b) \le a$. The concept
of l-groupoid is not self-dual; in any l-groupoid we have the monotonicity law
(2), but not the dual of (28), even for finite meets.

Much of the importance of l-groupoids stems from





47 The concepts just defined, together with Theorems 42-43, are due essentially to
Freudenthal, op. cit.

48 This concept, and (implicitly) that of l-groupoid, are due to M. Ward and R. P.
Dilworth (“Residuated lattices,” Trans. Am. Math. Soc. 45 (1939), pp. 335-354, and
“Non-commutative residuated lattices,” ibid. 46 (1939), pp. 426-444). The main contribution of
the present section is to show that the concept applies to important systems other than
ideals.










THEOREM 44 (Ward). The ideals of any ring form an l-groupoid if inclusion
is taken to mean set-inclusion and if ideal multiplication is taken as the group
operation.

We shall omit the proof, which is immediate. A special further property of
ideals is $x + y \le x \wedge y$; this is not a consequence of the postulates for an
l-groupoid, and implies $x \le 0 \wedge x$, or $0 \ge x$ for all x, as a special case.
Conversely, if every $x \le 0$, then $x + y \le 0 + y = y$ and similarly $x + y \le x$,
whence $x + y \le x \wedge y$, for all x, y.
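
The residual of the definition above can be computed directly in the simplest case of Theorem 44. This is a minimal sketch, assuming the ring of integers (an illustration chosen here), where every non-zero ideal is (n) for a positive integer n, inclusion of (m) in (n) means that n divides m, the "group operation" is multiplication of generators, and the join of ideals is generated by the gcd. The function names are hypothetical.

```python
from math import gcd
from functools import reduce

def contains(big, small):
    """True when the ideal (small) lies inside the ideal (big), i.e. big divides small."""
    return small % big == 0

def residual(a, b):
    """Generator of (a):(b), the join of all ideals (x) with (x)(b) inside (a)."""
    # (a) itself qualifies, so the join contains (a) and its generator divides a;
    # the join (ideal sum) of principal ideals of Z is generated by their gcd.
    xs = [x for x in range(1, a + 1) if contains(a, x * b)]
    return reduce(gcd, xs)

a, b = 12, 8
r = residual(a, b)            # generator of (12):(8)
fits = contains(a, r * b)     # the law (a:b) + b <= a, here (r)(b) inside (a)

# The special property x + y <= x ∧ y: the product ideal lies in the intersection,
# which is generated by the least common multiple.
in_meet = all(
    contains(x * y // gcd(x, y), x * y) for x in range(1, 13) for y in range(1, 13)
)
```

For these values the search returns the generator 3, which agrees with the closed form a / gcd(a, b) familiar from elementary number theory.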

DEFINITION. A residuated lattice is an l-groupoid in which $x + y \le x \wedge y$
for all x, y, or equivalently, in which every element is negative.

No l-group is a residuated lattice. However, we have

THEOREM 45. Let G be any l-group, and S any set of negative elements of G
which contains 0 and is closed under $+$ and $\vee$. Then S is a residuated lattice.

COROLLARY. The set of all negative elements of any l-group or l-groupoid is a
residuated lattice.

For instance, by Theorem 45, the non-positive non-increasing real functions
on any interval, the non-positive convex functions on any interval, and the
non-positive subharmonic functions on any plane region, form residuated
lattices.⁴⁹

THEOREM 46. An abstract lattice L is residuated when $\wedge$ is taken as the group
operation, if and only if the dual of L is a Brouwerian logic. In this case, the
residuation operation : specializes to the implication operation →.

PROOF. Compare the definitions given above with Theorem 8.4 of “Lattice
Theory.”

THEOREM 47 (J. W. Duthie⁵⁰). The binary relations on any set form an
l-groupoid, if the relative product is taken as the group operation, while the lattice
operations are given their usual significance.

We note also that the v-ideals of any commutative groupoid form a
residuated lattice.

PROOF. The different postulates defining an l-groupoid are proved in
Schröder’s “Algebra der Logik,” vol. III, esp. formula (29), p. 100, and formula
(6), p. 79. The proof can be supplied by anyone familiar with the definitions.

The relations form a Boolean algebra under inclusion; and it has been proved 
by Ward-Dilworth (op. cit., Thm. 7.4) that the only way to make a Boolean 
algebra residuated is to take lattice-meet as the group operation; hence we know 
in advance that relations cannot form a residuated lattice. 

We note that in every l-groupoid, all left-residuals a:a of elements with
themselves are idempotent: $(a:a) + (a:a) = a:a$. For by the definition of left-residual,
we have


$$(a:a) + (a:a) + a \le (a:a) + a \le a,$$


whence $(a:a) + (a:a) \le a:a$. Conversely, $a + 0 = a$, whence $0 \le a:a$, and so
$(a:a) = (a:a) + 0 \le (a:a) + (a:a)$, completing the proof. In residuated
lattices, a:a = 0, and the result just proved is trivial.





49 These and other function-theoretic examples of the same type were signalized in §133 of
“Lattice Theory,” where however the connection with residuated lattices was not remarked.

50 Communicated to the author orally; this result was not mentioned by O. Ore in his
Colloquium Lectures on relations. For the definitions of relative product, join, and meet,
for binary relations, cf. A. Tarski, “Introduction to Logic,” New York, 1941, pp. 90-93, or
E. Schröder, “Algebra der Logik.” It is interesting that the conversion operator $a \to \breve{a}$
should act as an involution on the algebra of relations.

Finally (cf. Dilworth, op. cit.), we can prove


(20″)  $a \vee b = a \vee c = 0$ implies $a \vee (b + c) = 0$

in any l-groupoid; in fact, the proof of Theorem 15 applies as it stands!


28. Riesz’ Interpolation Property⁵¹


The most basic of the lattice postulates (see §4) seem to be the reflexive law
P1 and the transitive law P3; in general, a system with a reflexive and transitive
relation is called a quasi-ordered set.

DEFINITION. A quasi-ordered group is a group $G$ with a homogeneous reflexive
and transitive relation. If the relation is also anti-symmetric (satisfies P2), then
$G$ is called a partially ordered group (or semiordered group).

It is well-known ("Lattice Theory", Thm. 1.2) that in any quasi-ordered set,
if $a \sim b$ is defined to mean that $a \ge b$ and $a \le b$, we get an equivalence
relation, and that we can consistently identify "equivalent" elements to get a
partially ordered set. It is easily shown that in a quasi-ordered group, the $x \sim 0$
form a normal subgroup $N$, while the other equivalence classes form the cosets
of $N$. This gives

THEOREM 48. The algorithm of identifying $a$ and $b$ whenever $a \ge b$ and $a \le b$,
yields from any quasi-ordered group a partially ordered group.
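
The identification described in Theorem 48 can be tried out on a small sample. The sketch
below is an illustration only and is not taken from the paper: it assumes the additive
group of integer pairs with the hypothetical quasi-order that compares first coordinates,
restricted to a finite grid. It checks that the elements equivalent to 0 are exactly the
pairs (0, y), and that two elements are equivalent precisely when they lie in the same coset.

```python
from itertools import product

# Hypothetical example: the additive group Z^2, sampled on a small grid.
def leq(a, b):
    """Quasi-order on Z^2: compare first coordinates only."""
    return a[0] <= b[0]

def equivalent(a, b):
    """a ~ b means a >= b and a <= b."""
    return leq(a, b) and leq(b, a)

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def neg(a):
    return (-a[0], -a[1])

grid = list(product(range(-2, 3), repeat=2))
zero = (0, 0)

# N = {x : x ~ 0}, restricted to the sample grid.
N = [x for x in grid if equivalent(x, zero)]

def same_coset(a, b):
    """a and b lie in the same coset of N iff a - b ~ 0."""
    return equivalent(add(a, neg(b)), zero)

# The equivalence classes of ~ coincide with the cosets of N on the sample.
classes_match_cosets = all(
    equivalent(a, b) == same_coset(a, b) for a in grid for b in grid
)

# On the representatives (x, 0) the induced relation is anti-symmetric.
reps = [(x, 0) for x in range(-2, 3)]
antisymmetric = all(
    a == b for a in reps for b in reps if leq(a, b) and leq(b, a)
)
```

In this toy case the quotient is simply the integers with their usual order, one coset for
each first coordinate.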

Existence postulates such as the Interpolation Properties to be discussed
below and the lattice postulates L'-L'' apply to quasi-ordered groups just as
well as to partially ordered groups; only uniqueness properties are lost. However,
by Theorem 48, no real generality is lost if we restrict ourselves to partially
ordered groups.⁵²

DEFINITION. Let $m$, $n$ be any cardinal numbers. A partially ordered set will
be said to have the $(m, n)$ Interpolation Property if and only if, given $x_1, \dots, x_m$
and $y_1, \dots, y_n$ such that $x_i \le y_j$ for all $i, j$, we can find a $z$ such that
$x_i \le z \le y_j$ for all $i, j$.

SPECIAL CASES. The reflexive law makes the $(m, 1)$ and $(1, n)$ Interpolation
Properties trivial. The $(0, 2)$ Interpolation Property is the Moore-Smith property
discussed in §8; it implies the $(0, n)$ Interpolation Property for all finite $n$.
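
For finite partially ordered sets the definition can be checked by brute force. The sketch
below is only an illustration: the function name `has_interpolation` and both example
posets are assumptions introduced here, not objects from the paper. It tests the (2, 2)
property on the divisors of 12 under divisibility (a lattice) and on a four-element poset
in which a, b lie below c, d with nothing in between.

```python
from itertools import combinations_with_replacement

def has_interpolation(elements, leq, m, n):
    """Check the (m, n) Interpolation Property on a finite poset by brute force."""
    for xs in combinations_with_replacement(elements, m):
        for ys in combinations_with_replacement(elements, n):
            if all(leq(x, y) for x in xs for y in ys):
                # Look for some z with x <= z <= y for every x in xs, y in ys.
                if not any(
                    all(leq(x, z) for x in xs) and all(leq(z, y) for y in ys)
                    for z in elements
                ):
                    return False
    return True

# Hypothetical poset 1: divisors of 12 under divisibility (a lattice).
divisors = [1, 2, 3, 4, 6, 12]
divides = lambda a, b: b % a == 0

# Hypothetical poset 2: a, b below c, d with no element in between.
butterfly = ["a", "b", "c", "d"]
below = {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")}
butterfly_leq = lambda x, y: x == y or (x, y) in below

lattice_ok = has_interpolation(divisors, divides, 2, 2)
butterfly_ok = has_interpolation(butterfly, butterfly_leq, 2, 2)
trivial_ok = has_interpolation(butterfly, butterfly_leq, 2, 1)
```

The second poset fails the (2, 2) property because no element lies between the pair a, b
and the pair c, d, while the (2, 1) property holds in it for the trivial reason given above.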





51 The author is greatly indebted to conversations with George Mackey and John von
Neumann for material of the present section; the basic ideas go back to F. Riesz, op. cit.

52 Partially ordered groups can be bizarre enough. For instance, consider the additive
group of real numbers, and let the "positive" elements be those which exceed unity; $na > 0$
need not imply $a > 0$.
















The $(0, \beta)$ Interpolation Property for all $\beta$ is equivalent to the existence of a
universal element $I$, and can hold in no partially ordered group except 0 (§22,
Lemma 1, Cor.). Again, the $(\alpha, \beta)$ Interpolation Property for all non-zero
cardinals is equivalent to conditional completeness: the condition that every non-void
set bounded above have a least upper bound, and dually. To see this,
given any bounded set of elements $x_i$, form the non-void set $y_j$ of upper bounds
to the $x_i$; then $x_i \le y_j$ identically, so that $z$ will exist with $x_i \le z \le y_j$
for all $i, j$; by definition, $z = \bigvee x_i$. Similarly, the $(\alpha, \beta)$ Interpolation
Property for all cardinals, zero included, is equivalent to completeness.

But, algebraically speaking, the most interesting case is the $(2, 2)$ Riesz
Interpolation Property. By induction, this implies every $(m, n)$ Interpolation Property
with $m$, $n$ finite and not zero. It is clearly weaker than the lattice property,
since if $x_i \le y_j$ for all $i, j$, then $x_i \le \bigvee x_i \le \bigwedge y_j \le y_j$ for all $i, j$.

For example (F. Riesz, op. cit.), the polynomials, and also the rational functions
with non-vanishing denominator, on any bounded closed region, form partially
ordered groups which have the Riesz Interpolation Property but are not
lattices.

THEOREM 49. The following conditions on any partially ordered group $G$ are
equivalent:

(i) The Riesz Interpolation Property,

(ii) The condition of Lemma 1, §11.

If $G$ is commutative, they are both equivalent to

(iii) The condition that if $a_1 + a_2 = b_1 + b_2$, and $a_1, a_2, b_1, b_2$ are positive, then
there exist positive elements $c_{11}, c_{12}, c_{21}, c_{22}$ such that

$$\sum_{j=1}^{2} c_{ij} = a_i \quad\text{and}\quad \sum_{i=1}^{2} c_{ij} = b_j.$$

(Riesz Refinement Postulate)
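
As a concrete instance of condition (iii), one may take the integers under addition with
their usual order, reading "positive" as non-negative. The following sketch is an
illustration under that assumption; the function name `riesz_refinement` and the
particular choice of $c_{11}$ as the smaller of $a_1$ and $b_1$ are introduced here and
are not claimed to be the construction of the paper.

```python
def riesz_refinement(a1, a2, b1, b2):
    """Return ((c11, c12), (c21, c22)) with row sums a1, a2 and column sums b1, b2.

    Assumes non-negative integers with a1 + a2 == b1 + b2.
    """
    if min(a1, a2, b1, b2) < 0 or a1 + a2 != b1 + b2:
        raise ValueError("need non-negative inputs with a1 + a2 == b1 + b2")
    c11 = min(a1, b1)   # take as much as possible in the first cell
    c12 = a1 - c11      # remainder of the first row
    c21 = b1 - c11      # remainder of the first column
    c22 = a2 - c21      # forced by the second row sum
    return (c11, c12), (c21, c22)

table = riesz_refinement(5, 4, 3, 6)
```

Here `table` is `((3, 2), (0, 4))`: the rows sum to 5 and 4 and the columns to 3 and 6.
The entry $c_{22}$ is never negative, since it equals $b_2$ when $a_1 \le b_1$ and $a_2$
otherwise.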


PROOF. First, (i) implies (ii). For if $0 \le x, a, b$ and $x \le a + b$, then $0 \le x, a$ and
$x - b \le x, a$ (transposing); hence if (i) holds, there exists $s$ with $0 \le s \le a$,
$s \le x$, whence $x = s + t$ ($t \ge 0$), and $x - b \le s$, whence $s + t = x \le s + b$
and so $t \le b$. Conversely, (ii) implies (i). By right-homogeneity, it suffices
to prove that if $0, x \le y, x + b$ then there exists $s$ with $0, x \le s \le y, x + b$.
But indeed, $0 \le x + b \le y + b$ since $x \le y$; hence $x + b = s + t$ ($0 \le s \le y$,
$0 \le t \le b$), whence $x \le x + t \le x + b = s + t$ and $s \ge x$, $s \le x + b$.

Finally, if $G$ is commutative, then (ii) and (iii) are equivalent. For $0 \le x \le a + b$
($a \ge 0$, $b \ge 0$) is equivalent to $a + b = x + (a + b - x)$, where all
four summands are positive. To say that under these circumstances $x = s + t$
($0 \le s \le a$, $0 \le t \le b$) is equivalent to saying that
$(a + b - x) = (a - s) + (b - t) = s' + t'$, where $s' = a - s$ and $t' = b - t$ satisfy
$a \ge s' \ge 0$, $b \ge t' \ge 0$, whence $s, s', t, t'$ behave as the $c_{ij}$ for (iii).

THEOREM 50. In any partially ordered group which has the Riesz Interpolation
Property, we know

(20') $a \cap b = 0$ and $a \cap c = 0$ imply $a \cap (b + c) = 0$.




PROOF. Suppose $x \le a, b + c$. Then $a$ and $b$ are upper bounds to 0 and
$x - c$, since $x - c \le x \le a$ and $b - (x - c) = (b + c) - x \ge 0$. Hence an
element can be inserted between $0, x - c$ and $a, b$. But since $a \cap b = 0$, this
element must be 0. Hence $x - c \le 0$, $x \le c$ as well as $x \le a$, and $x \le 0$.

By duality (20'') holds also; for the further study of commutative partially
ordered groups with the Riesz Interpolation Property, with especial emphasis
on the linear functionals on such groups, see F. Riesz, op. cit.


29. Unsolved problems, general case 


We shall conclude this paper with a list of problems of varying degrees of
interest and difficulty. For the purpose of classification, these will be divided
into those which involve general l-groups, and those which relate primarily to
commutative l-groups.

PROBLEM 1. Show that $na > nb$ for one positive $n$ implies $a > b$.

SUGGESTIONS. This is easy in the simply ordered case, or if $a$ and $b$ are
permutable (see §9, Lemma 3).

PROBLEM 2. Show that if $a > 0$ and $b > 0$, then

$$-a - b + a + b \ll a + b.$$

In words, the commutator of $a$ and $b$ is incomparably smaller than $a + b$.

SUGGESTION. If the commutator is in the center, then

$$n(a + b) \ge -na - nb + n(a + b) = \binom{n}{2}(-a - b + a + b).$$

Hence if the conjecture of Problem 2 can be proved, we have
$(a + b) \ge \tfrac{1}{2}n(-a - b + a + b)$ for every even integer $n$, giving the desired result.
This method, with the aid of finite induction, might be successfully applied at
least to hypercentral l-groups.

PROBLEM 3. Prove that every Archimedean l-group is commutative.

This result would be a corollary of the result conjectured in Problem 2.
Using Theorem 38, we would infer as a second corollary that every complete
l-group was commutative. Hence to disprove the conjectures of Problems 3-4,
it would be enough to find a complete non-commutative l-group, or an
Archimedean non-commutative l-group.

PROBLEM 4. Prove that a complete l-group either satisfies the chain condition
or has at least the cardinal number of the continuum.

PROBLEM 5. Find an l-group without proper l-ideals which is non-commutative.

SUGGESTIONS. By Theorem 30, it would suffice to find a simple l-group which
was non-Archimedean or not simply ordered. By Theorem 25, this is also
necessary, so that if the author's conjecture is correct, either a non-Archimedean
or a non-simply ordered simple l-group must exist. The author conjectures that
the former is certainly the case.

PROBLEM 6. Find all l-group orderings (homogeneous lattice orderings) of
the free group with two generators. Is the commutator-subgroup necessarily
an l-ideal?

SUGGESTION. See Problem 2.

PROBLEM 7. Find a necessary and sufficient condition that an abstract group
be isomorphic with the additive group of an l-group.

SUGGESTION. By Theorem 16, it is necessary that every element be of infinite
order; by Theorem 24, this is sufficient in the commutative case; the author
conjectures that it is also sufficient in the hypercentral case.

PROBLEM 8. Suppose that in an l-groupoid $0:(0:x) = x$ and $0::(0::x) = x$
for all $x$. What can be inferred?

SUGGESTIONS. The correspondence $x \to 0:x$ will then be a lattice involution,
so that the dual of (28) also holds. What about the commutative case? Will

$$0:(x + y) = 0:x + 0:y?$$


30. Unsolved problems, commutative case


Any commutative group without elements of finite order whose cardinal
number is at most that of the continuum is isomorphic with an additive subgroup
of the ordered group of real numbers under addition (proof by rational
bases), and so is isomorphic with an Archimedean l-group.

PROBLEM 9. Is every commutative group without elements of finite order
isomorphic with the additive group of an Archimedean l-group?

PROBLEM 10. Find a necessary and sufficient condition that a commutative
partially ordered group be group- and order-isomorphic with an additive subgroup
of a cardinal product of simply ordered Archimedean l-groups, or equivalently,
by real functions.⁵³

PROBLEM 11. Given l-groups $B$ and $C$, reduce the problem of finding all
l-groups $A$ having an l-ideal $J$ isomorphic with $C$ and l-quotient-group $A/J$
isomorphic with $B$ to a problem in pure group extension, in the commutative
case.

This problem was implicitly solved in special cases in §19; the special cases
$B$ simple and $C$ simple might well be attacked first.

PROBLEM 12. Find all Lie l-algebras: Lie algebras over the real field which
are vector lattices relative to a set of positive elements which is invariant under
all inner automorphisms.

SUGGESTIONS. Use the known classification of vector lattices with finite basis
(Cors. 1-2 of Theorem 35). The author conjectures that a Lie algebra can be
made into a Lie l-algebra only if it is solvable (or "integrable").

PROBLEM 13. Find all Lie l-groups in the large.

The problem in the small is contained in Problem 12; the fact that all elements
have infinite order should simplify it.

PROBLEM 14. Construct a theory of l-rings.





53 The work of von Neumann (unpublished), Stone (cf. footnote 35) et al. shows that any
such Archimedean group is isomorphic with a homomorphic image of such a cardinal
product.













The only postulates known (Stone, op. cit.) cover only a very special case:
subrings of cardinal unions of simply ordered l-rings, corresponding to rings of
functions. Cf. also A. A. Albert, "On ordered algebras", Bull. Am. Math.
Soc. 46 (1940), pp. 521-522.

PROBLEM 15. Find a more direct substitute for condition (8) in Theorem 9:
i.e., a simple condition or set of simple conditions on the operation $a \to a^+$
necessary and sufficient to make the operation $(a - b)^+ + b$ associative.


HARVARD UNIVERSITY








ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 


OPERATOR METHODS IN CLASSICAL MECHANICS, II 


By PAUL R. HALMOS AND JOHN VON NEUMANN
(Received December 23, 1941) 


Introduction 


The purpose of this paper is two-fold: to map all measure spaces for which 
this is possible on the unit interval, and to apply such mapping theorems to the 
study of ergodic measure preserving transformations with a pure point spectrum. 

“Mappings” between two measure spaces may be interpreted in two ways, 
as set mappings and as point mappings, and accordingly we give below two sets 
of necessary and sufficient conditions for the existence of a mapping from a given 
space to the interval. The first of these, the set mapping or algebraic iso- 
morphism theorem, seems to be known, and although it has never been explicitly 
stated in the literature there are many proofs of special cases of it on record. 
We give an explicit proof of it and use a construction of the proof in proving the 
second, point mapping or geometric isomorphism, theorem. This second theo- 
rem depends on the new concept of normal measure space: a seemingly artificial 
concept which is, however, useful for two reasons. First, it is purely measure 
theoretic (and not topological), in character, and hence is applicable to the 
measure spaces usually discussed in probability theory; second it is hereditary 
under all the usual operations on measure spaces (such as the formation of 
direct products, decomposition into direct sums, etc.). 

Using the concepts and results of the mapping theorems just described, and 
of the Pontrjagin duality theorem concerning compact and discrete abelian 
groups, we are able to show that every ergodic measure preserving transforma- 
tion with a pure point spectrum is isomorphic to a rotation on a compact abelian 
group. This is a “normal form” theorem for a certain class of measure pre- 
serving transformations and can be used to answer many questions, such as the 
existence of square roots, commutative transformations, etc., concerning such 
transformations. 

Although this paper is a continuation of an earlier work of one of us¹ it is to a
large extent independent of this earlier work. The proofs of the main theorems 
mentioned above are logically complete here; only in some of the applications, 
as for example in discussing the relation between point mappings and set map- 
pings, do we make use of the results of (I). 


1. General measure spaces; the algebraic isomorphism theorem 


Let $X$ be any set, and $\mathfrak{X}$ any Borel field of subsets of $X$; let $m$ be a
non-negative, countably additive, finite measure defined on $\mathfrak{X}$. The system
$\{X, \mathfrak{X}, m\}$, which we shall usually denote by $X$, or, if necessary to indicate its
dependence on $\mathfrak{X}$ and $m$, by $X(\mathfrak{X}, m)$, is called a measure space. Sets
$E \in \mathfrak{X}$ are called measurable; we shall use also the usual terminology of the Lebesgue
theory in describing functions as measurable, integrable, etc. A measure space
is complete if every subset of a measurable set of measure zero is itself measurable
(and has, of course, measure zero). Since it is always possible to extend the
definition of $m$ to a Borel field $\mathfrak{X}' \supset \mathfrak{X}$ so that $X(\mathfrak{X}', m)$ is
complete, we shall lose no generality, and gain somewhat in simplicity, by assuming completeness.

1 See John von Neumann, Zur Operatorenmethode in der klassischen Mechanik, Annals of
Mathematics, vol. 33, (1932), pp. 587-642. In the sequel we shall refer to this paper as (I).

In any measure space $X$ we shall write $\mathbf{B} = \mathbf{B}(\mathfrak{X})$ for the Boolean algebra
of measurable sets modulo sets of measure zero. We shall make use of the
notations of set theory, ($\subset$, $+$, etc.) in $\mathbf{B}$, and of the fact that we may consider
$m$ as defined on $\mathbf{B}$.

We discuss now the concept of separability in measure spaces. A Borel
field $\mathfrak{X}$ (or a measure space $X(\mathfrak{X}, m)$) is strictly separable if it contains a
countable collection of sets such that the smallest Borel field containing all of them,
(the Borel field spanned by them), is $\mathfrak{X}$ itself. Two sub Borel fields, $\mathfrak{B}$ and $\mathfrak{C}$,
of the Borel field $\mathfrak{X}$ of measurable sets in a measure space $X(\mathfrak{X}, m)$ are equivalent
if to every set $E$ in either one of them there corresponds a set $F$ in the other
such that the symmetric difference $(E - F) + (F - E)$ has measure zero. A
measure space is separable if there exists a strictly separable Borel field $\mathfrak{B}$
contained in and equivalent to $\mathfrak{X}$.² A concept, which lies logically between
separability and strict separability, more useful than either of these, is proper
separability. A measure space $X(\mathfrak{X}, m)$ is properly separable if there exists a strictly
separable Borel field $\mathfrak{B} \subset \mathfrak{X}$, such that to every $E \in \mathfrak{X}$ there corresponds an
$F \in \mathfrak{B}$ with $E \subset F$ and $m(F - E) = 0$.³ We observe that this definition is
self dual: by applying the condition to $X - E$ we readily obtain a set $F \in \mathfrak{B}$
with $F \subset E$ and $m(E - F) = 0$. We shall make use of the fact that if $X$ is
separable (or properly separable) and $\mathfrak{B}$ is the strictly separable Borel field
described in the definitions above then $\mathbf{B}(\mathfrak{X}) = \mathbf{B}(\mathfrak{B})$. In the case of (properly)
separable measure spaces it will be necessary to indicate in the notation the
strictly separable Borel field used; we shall write $X = X(\mathfrak{X}, \mathfrak{B}, m)$. We shall
call sets of $\mathfrak{B}$ Borel sets, and functions measurable ($\mathfrak{B}$) Baire functions. (A real
valued function $f(x)$ is measurable ($\mathfrak{B}$) if the inverse image under $f$ of every real
Borel set $S$, i.e. the set $\{x \mid f(x) \in S\}$, belongs to $\mathfrak{B}$.)





2 This is not the usual form in which this definition is given. Cf., for example, J. L.
Doob, One-parameter families of transformations, Duke Mathematical Journal, vol. 4,
(1938), p. 753. That our definition is, however, equivalent to the usual one is proved by
Paul R. Halmos, The decomposition of measures, Duke Mathematical Journal, vol. 8, (1941),
p. 387. We observe $X$ is separable if and only if the Boolean algebra $\mathbf{B}(\mathfrak{X})$ has a
countable number of generators.

3 The concept of proper separability, first introduced by W. Ambrose and S. Kakutani,
Structure and continuity of measurable flows, Duke Mathematical Journal, vol. 9, (1942),
pp. 25-42, is fundamental in measure theory. Although it is possible to give examples of
separable but not properly separable measure spaces, these examples are all of a more or
less pathological kind. One such example is the unit interval, with the Borel field of all
sets of Lebesgue measure zero and their complements in the role of $\mathfrak{X}$.







A measurable set $E$ in the measure space $X(\mathfrak{X}, m)$ is indecomposable if it
contains no proper measurable subsets other than the empty set; an element
$E \in \mathbf{B}(\mathfrak{X})$ is an atom if it contains no proper subelements, other than 0, in
$\mathbf{B}(\mathfrak{X})$. A measure space is non atomic if $\mathbf{B}(\mathfrak{X})$ has no atoms: in other words if every
measurable set of positive measure contains measurable subsets of smaller positive
measure. From the point of view of a study of the structure of measure
spaces indecomposable sets and atoms are uninteresting: we shall generally
assume that the former consist of exactly one point and the latter are absent.
More specifically our assumption will be described in the following terms.

A countable sequence, $A_1, A_2, \dots$, of subsets of $X$ is a separating sequence
if to every pair of points, $x \ne y$, we may find an integer $n$ with $x \in A_n$,
$y \in X - A_n$. If there exists in $X$ a separating sequence of measurable sets,
an indecomposable set contains exactly one point. We shall now show that the
assumption of the existence of a separating sequence of measurable sets has a
similar effect on atoms. Let $E$ be a set of positive measure which contains no
measurable subsets of smaller positive measure. It follows that for each $n$ one
of the two sets, $EA_n$ and $E(X - A_n)$, has measure zero and the other one has
measure $m(E)$. By a slight change of notation we may assume $m(EA_n) = m(E)$
for $n = 1, 2, \dots$. If we write $\prod_{n=1}^{\infty} A_n = A$, then we have $m(EA) = m(E)$;
since, however, $A$ can contain at most one point, this implies that for some
point $x \in E$ we have $m(E - x) = 0$. In other words the existence of a measurable
separating sequence implies that the weight of an atom is concentrated at
one point; if, for example, we assume that the measure of a point is always zero,
we may infer that the space is non atomic. Since in a measure space, which has
by definition finite measure, there can be at most a countable set of points of
positive measure, and since their measure theoretic structure is clear, we shall
generally assume non-atomicity explicitly.
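
The definition of a separating sequence can be illustrated on a finite set. The sketch
below is an assumption-laden toy, not part of the paper: the eight points, the sets built
from binary digits, and the helper names are all introduced here. It shows that the
digit sets together with their complements separate every ordered pair of points, that the
digit sets alone do not, and that the members of the sequence containing a given point
intersect in that point alone.

```python
def is_separating(points, sets):
    """True if every ordered pair x != y has some A in sets with x in A, y not in A."""
    return all(
        any(x in A and y not in A for A in sets)
        for x in points for y in points if x != y
    )

points = list(range(8))

# Hypothetical sequence: A_k = points whose k-th binary digit is 1, plus complements.
bit_sets = [{x for x in points if (x >> k) & 1} for k in range(3)]
complements = [set(points) - A for A in bit_sets]

with_complements = is_separating(points, bit_sets + complements)
bits_only = is_separating(points, bit_sets)

def intersection_of_sets_containing(x, sets):
    """Intersect all members of the sequence that contain x."""
    result = set(points)
    for A in sets:
        if x in A:
            result &= A
    return result

singletons = all(
    intersection_of_sets_containing(x, bit_sets + complements) == {x}
    for x in points
)
```

The digit sets alone fail because no one of them contains the point 0, so 0 cannot be
separated from any other point in the required direction.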

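If $X_1(\mathfrak{X}_1, m_1)$ and $X_2(\mathfrak{X}_2, m_2)$ are measure spaces, a set isomorphism between
$X_1$ and $X_2$ is a measure preserving isomorphism between the Boolean algebras
$\mathbf{B}(\mathfrak{X}_1)$ and $\mathbf{B}(\mathfrak{X}_2)$. More specifically a set isomorphism is a one to one mapping
$T$ from $\mathbf{B}(\mathfrak{X}_1)$ on $\mathbf{B}(\mathfrak{X}_2)$ which is such that

$$T(X_1 - E) = X_2 - TE,$$
$$T\Bigl(\sum_{n=1}^{\infty} E_n\Bigr) = \sum_{n=1}^{\infty} TE_n,$$
$$m_1(E) = m_2(TE).$$

If such a mapping $T$ exists, $X_1$ and $X_2$ are set isomorphic.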

After one more comment on notation we shall be ready to state and prove our
first result. Since the unit interval plays a fundamental role in our investigations
and is used as a yardstick with which to compare other measure spaces,
we find it convenient to introduce a special notation for it. We shall denote
the unit interval by $\bar{X}$, the collection of Lebesgue and Borel measurable sets by
$\bar{\mathfrak{X}}$ and $\bar{\mathfrak{B}}$ respectively, and Lebesgue measure by $\bar{m}$. In our terminology
$\bar{X} = \bar{X}(\bar{\mathfrak{X}}, \bar{\mathfrak{B}}, \bar{m})$ is a properly separable measure space.⁴

THEOREM 1. A necessary and sufficient condition that a measure space of total
measure one be set isomorphic to the unit interval is that it be separable and
non-atomic.

PROOF. Since the unit interval is separable and non-atomic and since these
properties are evidently invariant under set isomorphisms, the necessity of our
conditions is clear. To prove their sufficiency, let $X(\mathfrak{X}, \mathfrak{B}, m)$ be the given
measure space, $m(X) = 1$, and let $A_1, A_2, \dots$ be a countable sequence of Borel
sets which span $\mathfrak{B}$. We may assume (by adding a superfluous set to the $\{A_n\}$
if necessary) that $\sum_{n=1}^{\infty} A_n = X$. Then we may make correspond to every
rational number $r$, $0 \le r \le 1$, a set $B_r$ such that

(i) $\{A_n\}$ and $\{B_r\}$ span the same field;

(ii) $r < s$ implies $B_r \subset B_s$;

(iii) $\prod_{r > s} B_r = B_s$;

(iv) $\prod_r B_r = 0$; $\sum_r B_r = X$.⁵

We now define, for every real number $\alpha$, $0 \le \alpha \le 1$, a set $B_\alpha$ by
$B_\alpha = \prod_{r > \alpha} B_r$.
It is clear that this definition of $B_\alpha$ is consistent with its previous definition in
case $\alpha$ is rational, and that the family of sets $\{B_\alpha\}$ satisfies the conditions (ii),
(iii), (iv), (where in (iii) and (iv) we extend the products and sums over an
arbitrary countable set of real numbers $r$ for which $\inf r = s$ in (iii), $\inf r = 0$
and $\sup r = 1$, respectively, in (iv)). Moreover, condition (i) implies that
$B_\alpha \in \mathfrak{B}$ for all $\alpha$ and that the Borel field spanned by the $B_\alpha$ is $\mathfrak{B}$ itself.

Given now the family $B_\alpha$ we may find a (uniquely determined) function $f(x)$,
defined for $x \in X$, $0 \le f(x) \le 1$, for which $\{x \mid f(x) \le \alpha\} = B_\alpha$; we may, for
example, define

(1) $f(x) = \inf \{\alpha \mid x \in B_\alpha\}$.

The class of all sets of the form

(2) $f^{-1}(\bar{E}) = \{x \mid f(x) \in \bar{E}\}$,

where $\bar{E}$ is an arbitrary Borel set in the unit interval, is a Borel field contained
in $\mathfrak{B}$; since it contains all $B_\alpha$, and therefore all $A_n$, it coincides with $\mathfrak{B}$.
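
The bookkeeping behind formulas (1) and (2) can be followed on a finite stand-in. The
sketch below rests on assumptions introduced only for illustration: eight points in place
of $X$, the nine levels k/8 in place of the rationals of the unit interval, and a
particular nested family. A finite set cannot be non-atomic, so this shows only how a
nested family determines the function and is recovered from it, not the theorem itself.

```python
from fractions import Fraction

points = list(range(8))                      # hypothetical finite stand-in for X
levels = [Fraction(k, 8) for k in range(9)]  # finite stand-in for the rationals in [0, 1]

def B(r):
    """Nested family: B_r grows with r, B_0 is empty and B_1 is all of X."""
    return {x for x in points if Fraction(x + 1, 8) <= r}

def f(x):
    """Formula (1) on the grid: the least level r with x in B_r."""
    return min(r for r in levels if x in B(r))

# Defining property: {x | f(x) <= r} = B_r for every level r.
recovers_family = all({x for x in points if f(x) <= r} == B(r) for r in levels)

def preimage(E):
    """Formula (2): the set of points whose value f(x) lies in E."""
    return {x for x in points if f(x) in E}
```

On this grid the function takes the value (x + 1)/8 at the point x, and every set of the
family is a preimage of an initial segment of levels.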

Let $F(\alpha) = m\{x \mid f(x) \le \alpha\} = m(B_\alpha)$ be the distribution function of $f(x)$:
$F(\alpha)$ is monotone non-decreasing from 0 to 1 as $\alpha$ ranges between 0 and 1, and
is continuous from the right. (This much is always true, of an arbitrary
distribution function.) In our special case we assert that $F(\alpha)$ is continuous. For
if $\alpha = \alpha_0$ is a discontinuity of $F(\alpha)$, then $\{x \mid f(x) = \alpha_0\}$ is a set of positive
measure which therefore, (non-atomicity), has Borel subsets of smaller positive
measure. Such a subset cannot be put in the form $f^{-1}(\bar{E})$, contrary to what
we have already proved.

4 In the sequel we shall sometimes use the notation $\bar{X}(\bar{\mathfrak{X}}, \bar{\mathfrak{B}}, \bar{m})$ for the
perimeter of the unit circle in the complex plane: it is clear that this space has the same
measure theoretic structure as the unit interval. We shall always make it clear whether the
symbol $\bar{X}$ has its real or its complex meaning.

5 Cf. (I), p. 602; see also J. L. Doob, Stochastic processes with an integral valued parameter,
Transactions of the American Mathematical Society, vol. 44, (1938), p. 91.

For any $\bar{x}$, $0 \le \bar{x} \le 1$, we define $\bar{f}(\bar{x}) = \inf \{\alpha \mid F(\alpha) = \bar{x}\}$. It is well known
(and easily verified) that $\bar{f}(\bar{x})$ is a strictly monotone increasing (not necessarily
continuous) function of $\bar{x}$, which increases from 0 to 1 as $\bar{x}$ does, and which is
continuous on the left. Moreover the distribution function of $\bar{f}(\bar{x})$ is again $F(\alpha)$.

For any Borel set $\bar{E} \subset \bar{X}$, consider the set $\bar{f}^{-1}(\bar{E})$: we assert that the collection
of all sets of this form, (which clearly forms a Borel field), coincides with $\bar{\mathfrak{B}}$.
This is true since the increasing character of $\bar{f}(\bar{x})$ implies that every interval
$(0, \bar{x})$ has the form $\bar{f}^{-1}(\bar{E})$, where $\bar{E}$ can even be chosen as an interval.

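Suppose that it ever happens that $f^{-1}(\bar{E}_1) = f^{-1}(\bar{E}_2)$. (We shall now make
use of the fact that for an arbitrary Baire function $g(x)$, $0 \le g(x) \le 1$, the
correspondence $\bar{E} \to g^{-1}(\bar{E}) = \{x \mid g(x) \in \bar{E}\}$ is a homomorphism of $\bar{\mathfrak{B}}$ into $\mathfrak{B}$,
i.e. that $g^{-1}(\bar{X} - \bar{E}) = X - g^{-1}(\bar{E})$, and
$g^{-1}(\bar{E}_1 + \bar{E}_2 + \cdots) = g^{-1}(\bar{E}_1) + g^{-1}(\bar{E}_2) + \cdots$.) If we write
$\bar{E} = (\bar{E}_1 - \bar{E}_2) + (\bar{E}_2 - \bar{E}_1)$ for the
symmetric difference between $\bar{E}_1$ and $\bar{E}_2$, then it follows from the equality of
the distributions of $f(x)$ and $\bar{f}(\bar{x})$, that $\bar{m}\{\bar{f}^{-1}(\bar{E})\} = m\{f^{-1}(\bar{E})\} = 0$.
Conversely, of course, $\bar{f}^{-1}(\bar{E}_1) = \bar{f}^{-1}(\bar{E}_2)$ implies the same result.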

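Consequently the correspondence $f^{-1}(\bar{E}) \leftrightarrow \bar{f}^{-1}(\bar{E})$ is one to one, not
necessarily between $\mathfrak{B}$ and $\bar{\mathfrak{B}}$, but certainly between
$\mathbf{B} = \mathbf{B}(\mathfrak{X}) = \mathbf{B}(\mathfrak{B})$ and
$\bar{\mathbf{B}} = \mathbf{B}(\bar{\mathfrak{X}}) = \mathbf{B}(\bar{\mathfrak{B}})$. It is clear that this correspondence preserves measure, and
the homomorphic nature of the mappings $\bar{E} \to f^{-1}(\bar{E})$ and $\bar{E} \to \bar{f}^{-1}(\bar{E})$ shows
that it is also an algebraic isomorphism.
This concludes the proof of Theorem 1.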


2. Normal spaces; the geometric isomorphism theorem 


If $X_1(\mathfrak{X}_1, m_1)$ and $X_2(\mathfrak{X}_2, m_2)$ are measure spaces, a point isomorphism
between $X_1$ and $X_2$ is a one to one mapping $T$ from almost all of $X_1$ on almost
all of $X_2$ such that $E_1 \in \mathfrak{X}_1$ if and only if $E_2 = TE_1 \in \mathfrak{X}_2$, and then
$m_1(E_1) = m_2(E_2)$. If such a mapping $T$ exists, $X_1$ and $X_2$ are point isomorphic. Our
problem in this section is to find necessary and sufficient conditions in order
that a measure space be point isomorphic to the unit interval. The fundamental
concept in this connection is that of a normal space.

DEFINITION 1. A measure space is proper if it is complete, properly separable, 
and non-atomic, and if it contains a separating sequence of Borel sets. 

DEFINITION 2. A proper measure space is normal if to each real valued univalent
Baire function $f(x)$ there corresponds a set $X_0$ of measure zero such that the range,
$f(X - X_0)$, is a Borel set.

The following lemmas concerning proper and normal spaces will be useful 
in the sequel. 

LEMMA 1. On every proper measure space $X(\mathfrak{X}, \mathfrak{B}, m)$ there exist real valued
bounded univalent Baire functions.










PROOF. Since $X$ is certainly separable and non-atomic the construction of
the proof of Theorem 1 applies. We assert that the real valued bounded Baire
function $f(x)$ defined by (1) is univalent. For if the set $\{x \mid f(x) = \alpha\}$ contained
more than one point, then the intersection of this set with a Borel set separating
two of its points could not be expressed in the form $\{x \mid f(x) \in \bar{E}\}$. Since, however,
the proof of Theorem 1 establishes that every Borel set has this form,
$f(x)$ must be univalent.

LEMMA 2. If $X(\mathfrak{X}, \mathfrak{B}, m)$ is a proper measure space with the property that the
condition of Definition 2 is satisfied by every bounded function then $X$ is normal.

PROOF. Let $f(x)$ be any univalent Baire function, and let $G(y)$ be any continuous
function which maps the infinite interval, $-\infty < y < +\infty$, in a one
to one way on a finite interval. Then $g(x) = G(f(x))$ is a Baire function which
is univalent and bounded, hence, by hypothesis, there is a set $X_0$ of measure
zero such that $g(X - X_0)$ is a Borel set. The image of this Borel set under the
one to one continuous mapping $G^{-1}(y)$ is the range $f(X - X_0)$ which is therefore
also a Borel set.

LEMMA 3. If $X(\mathfrak{X}, \mathfrak{B}, m)$ is a normal space, $B \subset X$ is a Borel set, and $f(x)$
is a real valued univalent Baire function, then there is a set $B_0 \subset B$ of measure zero
such that $f(B - B_0)$ is a Borel set. $B_0$ can even be chosen in the form $BX_f$, where
$X_f$ is a Borel set of measure zero, depending on $f$ but not on $B$.

PROOF. We shall carry out the proof in three steps, first establishing the
existence of a suitable $B_0$ corresponding to a fixed $B$, then showing that $B_0$
may even be chosen as a Borel set, and, finally, proving on the basis of our
separability hypotheses, that we may choose $B_0$ in the form described in the
statement of the lemma.

(i) We observe that the first statement asserts, essentially, that a Borel set
in a normal space is itself a normal space. Accordingly, using Lemma 2, we
may assume that $f(x)$ is bounded. Let $f'(x)$ be a bounded univalent Baire function
on $X$, (Lemma 1); by appropriate linear transformations of $f(x)$ and of
$f'(x)$ we can secure

$$0 \le f(x) \le 1 < f'(x)$$

throughout $X$. Then the function $f^*(x)$, defined to be equal to $f(x)$ on $B$ and
to $f'(x)$ on $B' = X - B$, is a univalent Baire function on $X$, hence for a suitable
set $X_0$ of measure zero, $f^*(X - X_0)$ is a Borel set. The intersection of this
Borel set with the closed interval $(0, 1)$ is also a Borel set: this intersection is,
however, precisely $f(B - B_0)$, where $B_0 = BX_0$.

(ii) Let $B_1$ be a Borel set of measure zero, $B_1 \supset B_0$. Applying the result of
(i) to $X - B_1$ we may find a set $B_2 \supset B_1$ of measure zero such that $f(X - B_2)$
is a Borel set. We proceed similarly by induction, choosing $B_3 \supset B_2$ to be a
Borel set of measure zero, choosing $B_4 \supset B_3$ so that $f(X - B_4)$ is a Borel set,
and so on. We have $B_0 \subset B_1 \subset B_2 \subset B_3 \subset \cdots$; all $B_n$ are of measure zero;
for $n$ odd $B_n$ is a Borel set; for $n$ even $f(X - B_n)$ is a Borel set. We write
$B_\infty = \sum_{n=0}^{\infty} B_n$. Then $B_\infty$ has measure zero, and, because of the monotone








336 PAUL R. HALMOS AND JOHN VON NEUMANN 


if a = a is a discontinuity of F(a), then {x | f(x) = ao} is a set of positive 
measure which therefore, (non-atomicity), has Borel subsets of smaller positive 
measure. Such a subset cannot be put in the form f ‘(£), contrary to what 
we have already proved. 

For any %,0 < % S 1, we define f(z) = inf {a| F(a) = %}. It is well known 
(and easily verified) that {(%) is a strictly monotone increasing (not necessarily 
continuous) :-function of %, which increases from 0 to 1 as % does, and which is 
continuous on the left. Moreover the distribution function of f(%) is again F(a), 

For any Borel set £ C X, consider the set } (BE): we assert that the collection 
of all sets of this form, (which clearly forms a Borel field), coincides with @, 
This is true since the increasing character of f(%) implies that every interval 
(0, #) has the form f ‘(£), where £ can even be chosen as an interval. 

Suppose that it ever happens that f-'(#,) = f-'(£2). (We shall now make 
use of the fact that for an arbitrary Baire function g(x), 0 S g(x) S 1, the 
correspondence E — g'(E) = {x | g(x) ¢ E}, is a homomorphism of @ into @, 
ie. that g (X — EF) = X — g'(B), and g (fi + #, +---) = 
g (B,) + g (#2) + ---). If we write &’ = (f, — BF.) + (#, — F)) for the 
symmetric difference between EF, and £2, then it follows from the equality of 
the distributions of f(x) and f(%), that m{f\(2)} = m{f7(B)} = 0. Con- 
versely, of course, f '(£;) = f ’(£2) implies the same result. 

Consequently the correspondence f ‘(£) <= f'(£) is one to one, not neces- 
sarily between @ and (?, but certainly between 8 = B(X) = BQ) and $ = 
BX) = B@). It is clear that this correspondence preserves measure, and 


the homomorphic nature of the mappings Z — f'(#) and E — f'(£) shows 
that it is also an algebraic isomorphism. 
This concludes the proof of Theorem 1. 


2. Normal spaces; the geometric isomorphism theorem 


If Xi(X1, mi) and X2(%X2, me) are measure spaces, a point isomorphism 
between X, and X2 is a one to one mapping from almost all of X; on almost 
all of X2 such that HE; «4 if and only if 2, = TE, ¢%:, and then m(F,) = 
m2(E2). If such a mapping 7 exists, X; and X>2 are point isomorphic. Our 
problem in this section is to find necessary and sufficient conditions in order 
that a measure space be point isomorphic to the unit interval. The funda- 
mental concept in this connection is that of a normal space. 

DEFINITION 1. A measure space is proper if it is complete, properly separable, 
and non-atomic, and if it contains a separating sequence of Borel sets. 

DEFINITION 2. A proper measure space is normal if to each real valued univalent 
Baire function f(x) there corresponds a set Xo of measure zero such that the range, 
f(X — Xo), is a Borel set. 

The following lemmas concerning proper and normal spaces will be useful 
in the sequel. 

Lemma 1. On every proper measure space $X(\mathcal X, \mathcal B, m)$ there exist real valued
bounded univalent Baire functions.













OPERATOR METHODS IN CLASSICAL MECHANICS, II 337 


Proof. Since $X$ is certainly separable and non-atomic the construction of
the proof of Theorem 1 applies. We assert that the real valued bounded Baire
function $f(x)$ defined by (1) is univalent. For if the set $\{x \mid f(x) = a\}$ contained
more than one point, then the intersection of this set with a Borel set separating
two of its points could not be expressed in the form $\{x \mid f(x) \in \bar E\}$. Since, however,
the proof of Theorem 1 establishes that every Borel set has this form,
$f(x)$ must be univalent.

Lemma 2. If $X(\mathcal X, \mathcal B, m)$ is a proper measure space with the property that the
condition of Definition 2 is satisfied by every bounded function then $X$ is normal.

Proof. Let $f(x)$ be any univalent Baire function, and let $G(y)$ be any continuous
function which maps the infinite interval, $-\infty < y < +\infty$, in a one
to one way on a finite interval. Then $g(x) = G(f(x))$ is a Baire function which
is univalent and bounded, hence, by hypothesis, there is a set $X_0$ of measure
zero such that $g(X - X_0)$ is a Borel set. The image of this Borel set under the
one to one continuous mapping $G^{-1}(y)$ is the range $f(X - X_0)$ which is therefore
also a Borel set.
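
The reduction to bounded functions can be made concrete with a small numerical sketch. The choice $G(y) = (2/\pi)\arctan y$ and the sample function `f` below are assumptions made only for illustration; the proof allows any continuous one to one map of the line on a finite interval.

```python
import math

def G(y: float) -> float:
    """Continuous, strictly increasing map of the whole real line onto (-1, 1)."""
    return (2.0 / math.pi) * math.atan(y)

def G_inverse(t: float) -> float:
    """Inverse of G; continuous on the open interval (-1, 1)."""
    return math.tan((math.pi / 2.0) * t)

def f(x: float) -> float:
    """Hypothetical unbounded univalent function on the open interval (0, 1)."""
    return math.log(x / (1.0 - x))

def g(x: float) -> float:
    """The bounded univalent function g(x) = G(f(x)) used in the proof."""
    return G(f(x))

samples = [0.001, 0.25, 0.5, 0.75, 0.999]
g_values = [g(x) for x in samples]
recovered = [G_inverse(t) for t in g_values]  # should reproduce f on the samples
```

Here `g` stays inside (-1, 1) and remains one to one, and applying `G_inverse` to the range of `g` returns the range of `f`, which is the step the proof relies on.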

Lemma 3. If $X(\mathcal X, \mathcal B, m)$ is a normal space, $B \subset X$ is a Borel set, and $f(x)$
is a real valued univalent Baire function, then there is a set $B_0 \subset B$ of measure zero
such that $f(B - B_0)$ is a Borel set. $B_0$ can even be chosen in the form $BX_0$, where
$X_0$ is a Borel set of measure zero, depending on $f$ but not on $B$.

Proof. We shall carry out the proof in three steps, first establishing the
existence of a suitable $B_0$ corresponding to a fixed $B$, then showing that $B_0$
may even be chosen as a Borel set, and, finally, proving on the basis of our
separability hypotheses, that we may choose $B_0$ in the form described in the
statement of the lemma.

(i) We observe that the first statement asserts, essentially, that a Borel set
in a normal space is itself a normal space. Accordingly, using Lemma 2, we
may assume that $f(x)$ is bounded. Let $f'(x)$ be a bounded univalent Baire function
on $X$ (Lemma 1); by appropriate linear transformations of $f(x)$ and of
$f'(x)$ we can secure

$$0 \le f(x) \le 1 < f'(x)$$

throughout $X$. Then the function $f^*(x)$, defined to be equal to $f(x)$ on $B$ and
to $f'(x)$ on $B' = X - B$, is a univalent Baire function on $X$, hence for a suitable
set $X_0$ of measure zero, $f^*(X - X_0)$ is a Borel set. The intersection of this
Borel set with the closed interval $(0, 1)$ is also a Borel set: this intersection is,
however, precisely $f(B - B_0)$, where $B_0 = BX_0$.
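
The gluing in step (i) can be sketched on a finite sample. The space, the set `B` and the two functions below are hypothetical stand-ins chosen so that $0 \le f \le 1 < f'$; the sketch only illustrates why $f^*$ stays univalent and why cutting its range with $[0, 1]$ recovers the image of $B$.

```python
def glue(f, f_prime, in_B):
    """Return f*, equal to f on B and to f' on the complement of B."""
    def f_star(x):
        return f(x) if in_B(x) else f_prime(x)
    return f_star

# Hypothetical data: X is sampled from [0, 1), and B = [0, 0.5).
f = lambda x: x * x            # univalent on X, values in [0, 1]
f_prime = lambda x: 2.0 + x    # univalent on X, values strictly above 1
in_B = lambda x: x < 0.5

f_star = glue(f, f_prime, in_B)
points = [i / 100 for i in range(100)]
values = [f_star(x) for x in points]

# Part of the range of f* that lies in the closed interval [0, 1].
range_in_unit = [v for v in values if 0.0 <= v <= 1.0]
# Image of B under the original f.
image_of_B = [f(x) for x in points if in_B(x)]
```

Because the ranges of `f` and `f_prime` are disjoint, `f_star` takes no value twice, and `range_in_unit` coincides with `image_of_B`.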

(ii) Let $B_1$ be a Borel set of measure zero, $B_1 \supset B_0$. Applying the result of
(i) to $X - B_1$ we may find a set $B_2 \supset B_1$ of measure zero such that $f(X - B_2)$
is a Borel set. We proceed similarly by induction, choosing $B_3 \supset B_2$ to be a
Borel set of measure zero, choosing $B_4 \supset B_3$ so that $f(X - B_4)$ is a Borel set,
and so on. We have $B_0 \subset B_1 \subset B_2 \subset B_3 \subset \cdots$; all $B_n$ are of measure zero;
for $n$ odd $B_n$ is a Borel set; for $n$ even $f(X - B_n)$ is a Borel set. We write
$B_0^* = \sum_{n=0}^{\infty} B_n$. Then $B_0^*$ has measure zero, and, because of the monotone





338 PAUL R. HALMOS AND JOHN VON NEUMANN


character of the sequence $\{B_n\}$, $B_0^* = \sum_{n=0}^{\infty} B_{2n+1}$, so that $B_0^*$ is a Borel set.
Similarly $X - B_0^* = X - \sum_{n=0}^{\infty} B_{2n} = \prod_{n=0}^{\infty} (X - B_{2n})$, so that $f(X - B_0^*)$ is a
Borel set, and also $B(X - B_0^*) = (B - B_0)(X - B_0^*)$, so that $f(B(X - B_0^*)) =
f(B - B_0)f(X - B_0^*)$ is a Borel set. We may accordingly change notation and
denote by $B_0$ the intersection of $B$ and $B_0^*$: this new $B_0$ is a Borel set of measure
zero with the property that $f(B - B_0)$ is a Borel set.

(iii) Let $A_1, A_2, \cdots$ be a sequence which spans $\mathcal B$, and apply the result of
(ii) to find, for each $n$, a Borel set $A_n^0 \subset A_n$, of measure zero, such that
$f(A_n - A_n^0)$ is a Borel set. We write $A^0 = \sum_{n=1}^{\infty} A_n^0$, and we apply (ii) once
more, this time to $X - A^0$, to find a Borel set $X_0 \supset A^0$, of measure zero, such
that $f(X - X_0)$ is a Borel set. Let us write $A_n' = A_n - A_n^0$, and let $\mathcal B'$ be the
Borel field ($\subset \mathcal B$) spanned by the $A_n'$. Then we have $(X - X_0)A_n =
(X - X_0)A_n'$ for all $n$, and we see, moreover, that to every Borel set $B$ (i.e.
to every set $B \in \mathcal B$) there corresponds a set $B' \in \mathcal B'$ such that $(X - X_0)B =
(X - X_0)B'$. Since $f(A_n')$ is a Borel set, and since the collection of sets $A$ for
which $f(A)$ is a Borel set is clearly a Borel field (because $f$ is univalent), it
follows that for every $B' \in \mathcal B'$, $f(B')$ is a Borel set. Consequently for every
$B \in \mathcal B$

$$f(B - BX_0) = f(B(X - X_0)) = f(B'(X - X_0)) = f(B')\,f(X - X_0),$$

so that $f(B - BX_0)$ is a Borel set, and the proof of the lemma is complete.

Lemma 4. If $X(\mathcal X, \mathcal B, m)$ is a proper measure space, and if for a single real
valued univalent Baire function $g(x)$ we can find a set $X_0$ of measure zero such that
$g(B - BX_0)$ is a Borel set whenever $B$ is, then $X$ is normal and, moreover, this
same set $X_0$ will satisfy the condition of Definition 2 for any real valued univalent
Baire function $f(x)$.

Proof. We write $Y = g(X - X_0)$; for every $y_0 \in Y$, $y_0 = g(x_0)$, we define
$F(y_0) = f(x_0)$. $F(y)$ is then a real valued univalent function of the real variable
$y \in Y$. Since

(3) $\{y \mid F(y) < a\} = g[\{x \mid f(x) < a\}(X - X_0)]$,

and since the right member is a Borel set by hypothesis, $F(y)$ is a Baire function.
Since $f(x) = F(g(x))$, we have $f(X - X_0) = F(Y)$, and therefore $f(X - X_0)$
is a Borel set.⁵

An important class of measure spaces is the class of m-spaces. An m-space
is a complete measure space $X(\mathcal X, m)$ on which a metric is defined so that,
topologically, it is a complete separable space, and which satisfies the following
two conditions:

(i) the measure of an open set is positive;

(ii) for every measurable set $E$, $m(E) = \inf\{m(O) \mid E \subset O,\ O \text{ open}\}$.

With the Borel field $\mathcal B$ of Borel sets (in the usual topological sense of the word) $X =
X(\mathcal X, \mathcal B, m)$ becomes a proper measure space; it is a known result of topology





5 See F. Hausdorff, Mengenlehre, Berlin, 1935, p. 266. 










that it is even normal in our sense of the word, and that the exceptional set Xo 
of measure zero may even be chosen as the empty set.” 

We shall use m-spaces later; at present we mention them only as examples of 
normal spaces. The following theorem, the main theorem of the present sec- 
tion, applies to m-spaces, (since they are normal), and shows that, measure 
theoretically, they are isomorphic to the unit interval. 

THEOREM 2. A necessary and sufficient condition that a measure space of total
measure one be point isomorphic to the unit interval is that it be normal.

Proof. The necessity of our condition is obvious: the unit interval is normal
and normality is invariant under point isomorphism. Before giving a proof of 
sufficiency we remark on the hypotheses. Since the various conditions in the 
definition of a proper space are logically independent, they are obviously indis- 
pensable for a sufficiency proof. It is possible that the condition of normality 
could be replaced by a weaker one, but examples seem to indicate that it is the 
best way of expressing that the space is “measurable in itself.” 

For the proof of sufficiency we use the notations of the proof of Theorem 1;
in particular we use the functions $f(x)$ and $\bar f(\bar x)$ that we defined there.

We denote by $D$ and $\bar D$ the ranges of $f(x)$ and $\bar f(\bar x)$ respectively. By omitting
from $X$ a set of measure zero we may, by normality, assume that $D$ is a Borel
set; $\bar D$ is also a Borel set. (We observe that the omission of a set of measure
zero does not change the distribution of $f$ and hence does not change $\bar f$ at all.)
Form the set $R = (D - \bar D) + (\bar D - D)$. Since $\bar f^{-1}(D - \bar D) = \bar f^{-1}(D) - \bar f^{-1}(\bar D)$
lies entirely in the complement of $\bar f^{-1}(\bar D)$, and since this complement is empty,
$\bar f^{-1}(D - \bar D)$ is empty. Since $f^{-1}(D - \bar D)$ has the same measure as $\bar f^{-1}(D - \bar D)$,
this proves that the measure of $f^{-1}(D - \bar D)$ is zero. Similarly we can prove that
the measure of both $\bar f^{-1}(\bar D - D)$ and $f^{-1}(\bar D - D)$ is zero (and, in fact, the
latter is empty). Hence if we omit from both $X$ and $\bar X$ a Borel set, namely
$f^{-1}(R)$ and $\bar f^{-1}(R)$ respectively, of measure zero, on the remainder $f$ and $\bar f$ are
univalent Baire functions with identical (Borel measurable) ranges.

If to every $x \in X$ (after the omission, as described, of a set of measure zero),
we make correspond the point $\bar f^{-1}(f(x)) \in \bar X$, the correspondence is one to one.
Moreover if $B$ is any Borel set in $X$, and $B' = f(B)$, then $B'$ is a Borel set and
$f^{-1}(B') = B$. Consequently, considered as an element of the Boolean algebra
$\mathbf B(X)$, the correspondent, under the set mapping described in the proof of Theorem 1,
of $B$ is $\bar f^{-1}(B') = \bar B = \bar f^{-1}(f(B))$, so that the point mapping just described
induces precisely the same set isomorphism between $\mathbf B$ and $\bar{\mathbf B}$. It follows that
this point correspondence is measure preserving. This concludes the proof of
Theorem 2.


3. The relation between set transformations and point transformations 


If $T$ is a measure preserving transformation (i.e. a point isomorphism) of a
measure space $X(\mathcal X, m)$ on itself, then $T$ induces a set mapping (of $\mathbf B = \mathbf B(X)$





"See Hausdorff, op. cit., p. 269. 







on itself) by making correspond to every set $E \in \mathcal X$ the set $TE \in \mathcal X$. It is
known that in an m-space the converse is true: every set isomorphism is induced
in this way by a point isomorphism.’ Motivated by this we give the following
definition.

DEFINITION 3. A measure space $X(\mathcal X, m)$ has sufficiently many measure preserving
transformations if every set isomorphism of $\mathbf B$ on itself is induced by a
point isomorphism of $X$ on itself.

It follows from Theorem 2 that every normal space has sufficiently many 
measure preserving transformations. In between the two concepts (normal 
spaces and spaces with sufficiently many measure preserving transformations) 
there is, however, room for a pathological occurrence which we shall describe in 
this section. We begin by proving some auxiliary results. 

Lemma 5. If two point mappings, on a measure space $X$ which contains a
separating sequence $E_1, E_2, \cdots$ of measurable sets, induce the same set mapping
on $\mathbf B$ then they differ on at most a set of measure zero.

Proof. It is sufficient to consider the case where one of the transformations
is the identity. If then $TE_n$ and $E_n$ differ only on a set of measure zero, for
$n = 1, 2, \cdots$, it follows that all $T^kE_n$ differ from each other only on sets of
measure zero. Hence the invariant set

$$F_n = \sum_{k=-\infty}^{\infty} T^kE_n - \prod_{k=-\infty}^{\infty} T^kE_n$$

has measure zero. We form the invariant set $X'$ by omitting from $X$ the set
$\sum_{n=1}^{\infty} F_n$ of measure zero. If now $x \ne Tx$, then some $E_n$ contains one but not
both of $x$ and $Tx$, and therefore $x$ is contained in one but not both of $E_n$ and
$T^{-1}E_n$. Consequently $x \in F_n$, so that $x \notin X'$.

Lemma 6. Let $X(\mathcal X, m)$ be a measure space and let $X' \subset X$ be any (not necessarily
measurable) subset of $X$. Let $\mathcal X'$ be the collection of all sets of the form
$E' = X'E$, with $E \in \mathcal X$; for every $E' \in \mathcal X'$, $E' = X'E$, define $m'(E') = m(E)$.
With these definitions $m'$ is uniquely determined (so that $X'(\mathcal X', m')$ is a measure
space) if and only if the outer measure of $X'$ in $X$ is equal to the measure of $X$.®

Lemma 7. If $\{\varphi_n(x)\}$, $n = 1, 2, \cdots$, is a complete orthonormal set of functions
in $L_2(X)$, where $X(\mathcal X, m)$ is a measure space which contains a separating sequence,
$E_1, E_2, \cdots$, of measurable sets, then there is a set $N \in \mathcal X$ of measure zero such
that $x, y \notin N$ and $\varphi_n(x) = \varphi_n(y)$ for $n = 1, 2, \cdots$, implies $x = y$.





’See John von Neumann, Einige Sätze über messbare Abbildungen, Annals of Mathematics,
vol. 33, (1932), p. 582. In definition 5, p. 576, all descriptive properties of the
transformation (such for example as M, + M;—> M; + M3) should be modified by the phrase
“neglecting sets of measure zero.”

® The outer measure of $E_0$, $m^*(E_0)$, is defined by $m^*(E_0) = \inf\{m(E) \mid E_0 \subset E \in \mathcal X\}$.
Similarly we may define the inner measure, $m_*(E_0) = \sup\{m(E) \mid E_0 \supset E \in \mathcal X\}$. If $X$ is
complete then $E_0$ is measurable (i.e. $E_0 \in \mathcal X$) if and only if $m_*(E_0) = m^*(E_0)$ $(= m(E_0))$.
In case $X$ is properly separable it is sufficient to take the supremum and infimum
over Borel sets $E$. For the proof of Lemma 6, see J. L. Doob, Stochastic processes depending
on a continuous parameter, Transactions of the American Mathematical Society, vol. 42,
(1937), pp. 109-110.







Proof. Let $\psi_m(x)$ be the characteristic function of $E_m$; we have

(4) $\psi_m(x) = \sum_{n=1}^{\infty} a_{nm}\varphi_n(x)$,

in the sense of convergence in the mean (of order two). Consequently, for
each $m$, a subsequence of the partial sums of the series in (4) converges to
$\psi_m(x)$ almost everywhere; for each $m$ we choose a fixed subsequence with this
property and we let $N$ be the union of all the sets of measure zero at which
these subsequences do not converge to $\psi_m(x)$. If $x, y \notin N$ and $\varphi_n(x) = \varphi_n(y)$
for all $n$, then it follows that $\psi_m(x) = \psi_m(y)$ for all $m$, whence (using the fact
that $E_1, E_2, \cdots$ is a separating sequence) $x = y$.

Lemma 8. Let $X(\mathcal X, \mathcal B, m)$ be the perimeter of the unit circle in the complex
plane, and let $\mu(E)$ be any measure (i.e. a countably additive, non-negative
function with $\mu(X) = 1$) defined for $E \in \mathcal B$. If for a single number $\lambda$, with $|\lambda| = 1$
and $(\arg\lambda)/2\pi$ irrational, $\mu$ is invariant under rotation through $\arg\lambda$, i.e.
$\mu(\lambda E) = \mu(E)$ for every $E \in \mathcal B$, then $\mu(E) = m(E)$.

Proof. Let $A_1$ and $A_2$ be any two closed intervals (arcs) of the same length
in $X$. Since the sequence $\{\lambda^n\}$ of powers of $\lambda$ is everywhere dense in $X$, we may
find a sequence $\{n_j\}$ of positive integers, so that

(5) $\lim_j \lambda^{n_j}A_1 = A_2$,¹⁰

and consequently

(6) $\lim_j \mu(\lambda^{n_j}A_1) = \mu(A_2)$.¹¹

Since $\mu(\lambda^{n_j}A_1) = \mu(A_1)$, we have proved that $\mu(A_1) = \mu(A_2)$. Thus $\mu(A)$ is a
function of the arc length of $A$, i.e. of $m(A)$. This numerical function is clearly
monotone and additive, hence proportional to $m(A)$. Considering $A = X$
shows that the factor of proportionality is 1. Thus $\mu(E)$ and $m(E)$ agree for
arcs, and therefore for all Borel sets.
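
The density of the powers of $\lambda$, on which the proof rests, can be observed numerically. The sketch below is only an illustration: the angle $2\pi\sqrt{2}$, the target point $i$ and the search bound are assumptions, and `nearest_power` is a hypothetical helper, not part of the argument.

```python
import cmath
import math

def nearest_power(theta: float, target_angle: float, n_max: int):
    """Among lambda^n, 1 <= n <= n_max, with lambda = exp(i*theta),
    return (n, distance) for the power closest to exp(i*target_angle)."""
    target = cmath.exp(1j * target_angle)
    best_n, best_dist = 0, float("inf")
    for n in range(1, n_max + 1):
        dist = abs(cmath.exp(1j * n * theta) - target)
        if dist < best_dist:
            best_n, best_dist = n, dist
    return best_n, best_dist

# (arg lambda) / (2 pi) = sqrt(2) is irrational, as Lemma 8 requires.
theta = 2.0 * math.pi * math.sqrt(2.0)
n_coarse, dist_coarse = nearest_power(theta, math.pi / 2.0, 20)
n_fine, dist_fine = nearest_power(theta, math.pi / 2.0, 2000)
```

Raising the search bound can only shrink the best distance; for an irrational angle it tends to zero, which is the density used to obtain (5).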

As an immediate consequence of this lemma we observe that if for any Borel
set $E_0$ we have $E_0 = \lambda E_0$, then $m(E_0) = 0$ or else $m(E_0) = 1$, for otherwise

$$\mu(E) = m(EE_0)/m(E_0)$$

would contradict what we just proved.

After these preliminaries we are now ready to introduce the pathological con- 
cept we mentioned at the beginning of this section. 

DEFINITION 4. A (not necessarily measurable) subset $E$ of a measure space $X$
is absolutely invariant if for every measure preserving transformation $T$ of $X$ on
itself, the symmetric difference $(E - TE) + (TE - E)$ is measurable and has
measure zero.

Lemma 9. If $E$ is measurable and $m(E) = 0$ or $m(E) = m(X)$ then $E$ is
absolutely invariant. Conversely if $X$ is separable and non-atomic and $E \subset X$





10 See S. Saks, Theory of the integral, Warszawa, 1937, p. 5.
11 See Saks, op. cit., p. 8.
















is measurable and absolutely invariant, then $m(E) = 0$ or $m(E) = m(X)$; if $E$ is
not measurable and absolutely invariant then $m_*(E) = 0$, $m^*(E) = m(X)$.

Proof. The first statement is obvious. To prove the remaining statements
we observe that if $T$ is a measure preserving transformation and if $A$ is a set
(almost) invariant under $T$, in the sense that $(A - TA) + (TA - A)$ is measurable
and has measure zero, then any measurable cover, $A^*$, and any measurable
kernel, $A_*$, of $A$ are also (almost) invariant under $T$.¹² For $A \subset A^*$ implies
$TA \subset TA^*$; since $TA$ and $A$ are almost equal, and $T$ is measure preserving,
$TA^*$ is a measurable cover of $TA$, and therefore $TA^* + (A - TA)$ is a measurable
cover of $A$. It follows (since any two measurable covers of $A$ are almost
equal) that $TA^*$ is almost equal to $A^*$, as was to be proved. A similar argument
applies to measurable kernels.

It follows from the preceding paragraph that if $E$ is absolutely invariant then
so are $E_*$ and $E^*$. If we knew that a measurable absolutely invariant set must
have measure zero or $m(X)$, we could conclude that for a non-measurable absolutely
invariant $E$, $m_*(E) = 0$ and $m^*(E) = m(X)$. In the case where $X$ is the
perimeter of the unit circle, there are many examples of measure preserving
transformations whose measurable invariant sets all have measure zero or $m(X)$:
in fact the rotations described in Lemma 8 are such. If a set is invariant under
all measure preserving transformations it is a fortiori invariant under these and
hence if it is measurable it will have measure zero or $m(X)$. The general case
is, however, reduced to the case of the circle by Theorem 1.

To show that the concept of absolute invariance is not vacuous we shall now 
show that non-measurable absolutely invariant sets exist. In the existence 
proof we make free use of the continuum hypothesis and well ordering. 

Lemma 10. If $X = X(\mathcal X, \mathcal B, m)$ is a proper measure space of total measure
one, there exists an absolutely invariant set $E \subset X$ with $m_*(E) = 0$, $m^*(E) = 1$.

Proof. Since on a separable measure space there are at most $\mathfrak c$ (= the power
of the continuum) set transformations (since a set transformation is completely
determined by its behavior on a countable collection of sets, and the set of all
functions from a set of power $\aleph_0$ to a set of power $\mathfrak c$ has power $\mathfrak c$), it follows from
Lemma 5 that we may find a set of at most $\mathfrak c$ measure preserving transformations
of $X$ on itself with the property that every measure preserving transformation
differs on at most a set of measure zero from one of the given set. Let this set
be well ordered, so that to each ordinal $\alpha < \Omega$ (= the first uncountable ordinal)
there corresponds a measure preserving transformation $T_\alpha$. We may similarly
enumerate the collection of all Borel sets of positive measure: let these be denoted
by $E_\alpha$, $\alpha < \Omega$.

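For any $x \in X$ and any $\alpha < \Omega$ we write

$$C_\alpha(x) = \Bigl\{\prod_{i=1}^{k} T_{\alpha_i}^{n_i}x \Bigm| \alpha_i \le \alpha,\ k = 1, 2, \cdots;\ n_i = 0, \pm 1, \pm 2, \cdots\Bigr\}.$$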





12 $A^*$ [or $A_*$] is a measurable cover [or kernel] of $A$ if it is measurable, if $A \subset A^*$ [or $A_* \subset A$],
and if $m^*(A) = m(A^*)$ [or $m_*(A) = m(A_*)$]. If $A_1^*$ and $A_2^*$ are measurable covers of
$A$ then $(A_1^* - A_2^*) + (A_2^* - A_1^*)$ has measure zero.




$C_\alpha(x)$ is the smallest set containing $x$ and invariant under $T_\beta$ for all $\beta \le \alpha$.
Further relevant properties of $C_\alpha(x)$ are the following. $C_\alpha(x)$ is a countable
set; for $\alpha \le \beta$, $C_\alpha(x) \subset C_\beta(x)$; if $y \notin C_\alpha(x)$, then $C_\alpha(y)$ and $C_\alpha(x)$ are disjoint.

By transfinite induction we now define points $x_\alpha$ and $y_\alpha$. $x_1$ is chosen in $E_1$;
$y_1$ is chosen in $E_1$ but not in $C_1(x_1)$. Since $C_1(x_1)$ is countable and $E_1$ (being a
Borel set of positive measure) is not, the choice of $y_1$ is possible. If $x_\alpha$ and $y_\alpha$
are defined for all $\alpha < \beta$, we define $x_\beta$ as follows. Since the set

$$\sum_{\alpha<\beta} \{C_\beta(x_\alpha) + C_\beta(y_\alpha)\}$$

is countable, we may choose $x_\beta \in E_\beta$ so that $x_\beta$ is not in this set. After this is
done we may add $C_\beta(x_\beta)$ to this set and choose $y_\beta$ so that $y_\beta \in E_\beta$, but $y_\beta$ is not
in the enlarged set.

Concerning the points $x_\alpha$ and $y_\alpha$ we now assert: for any $\alpha$ and $\beta$, $\alpha \ne \beta$,
$C_\alpha(x_\alpha)$ and $C_\beta(y_\beta)$ are disjoint. If $\alpha < \beta$, then we know, by definition, that
$y_\beta \notin C_\beta(x_\alpha)$ so that $C_\beta(y_\beta)$ and $C_\beta(x_\alpha)$ are disjoint; a fortiori $C_\beta(y_\beta)$ and $C_\alpha(x_\alpha)$
are disjoint. If $\alpha > \beta$, then again $x_\alpha$ is not in $C_\alpha(y_\beta)$ so that $C_\alpha(x_\alpha)$ and
$C_\alpha(y_\beta)$ are disjoint, and therefore so also are $C_\alpha(x_\alpha)$ and $C_\beta(y_\beta)$.

We write

$$A = \sum_{\alpha<\Omega} C_\alpha(x_\alpha); \qquad B = \sum_{\alpha<\Omega} C_\alpha(y_\alpha);$$

it follows that $A$ and $B$ are disjoint. Since $A$ contains $x_\alpha$ and $B$ contains $y_\alpha$,
both $A$ and $B$ have at least one point in common with every Borel set of positive
measure; consequently $X - A$ and $X - B$ cannot contain any such sets. It
follows that both $A$ and $B$ have outer measure one (since their complements
have inner measure zero), and since each is contained in the complement of the
other, they both have inner measure zero.

It is now easy to see that $A$ is (almost) invariant under every measure preserving
transformation $T$. Given $T$ we may find $\beta < \Omega$, such that $T$ and $T_\beta$
differ on at most a set of measure zero. Also we have

$$T_\beta A = \sum_{\alpha<\Omega} T_\beta C_\alpha(x_\alpha).$$

Since for $\alpha \ge \beta$, $C_\alpha(x_\alpha)$ is invariant under $T_\beta$, $A$ and $T_\beta A$ can differ at most
on the countable set $\sum_{\alpha<\beta} T_\beta C_\alpha(x_\alpha)$. Since $T_\beta A$ and $TA$ differ on at most
a set of measure zero, we have proved that $A$ and $TA$ differ on at most a set of
measure zero. We may choose either $A$ or $B$ for the $E$ of Lemma 10.

The following two lemmas establish the connection between absolute invari- 
ance and the property of having sufficiently many measure preserving trans- 
formations. 

Lemma 11. Let $X(\mathcal X, m)$ be a measure space of total measure one with sufficiently
many measure preserving transformations, and let $X' \subset X$ be any subset
of $X$ with $m^*(X') = 1$. If $X'$ is absolutely invariant, then the measure space
$X'(\mathcal X', m')$ (defined in Lemma 6) has sufficiently many measure preserving
transformations.










Proof. The correspondence $E \rightleftarrows E' = X'E$ is a set isomorphism between
$\mathbf B = \mathbf B(X)$ and $\mathbf B' = \mathbf B(X')$. Through this isomorphism any set mapping of $X'$
on itself (i.e. any set isomorphism of $\mathbf B'$ on itself) induces a set mapping of $X$
on itself. Since $X$ has, by hypothesis, sufficiently many measure preserving
transformations, it follows that to any set mapping $T'$ on $X'$ there corresponds
a measure preserving transformation $T$ of $X$ on itself, such that $T$ induces
the same set mapping of $X$ as $T'$. Since $X'$ is absolutely invariant,
$(X' - TX') + (TX' - X')$ has measure zero; let $N'$ be the smallest set invariant
under $T$ which contains this set of measure zero. We may redefine $T$ on $N'$
to be the identity; the resulting $T$ leaves $X'$ strictly invariant and may therefore
be considered as a measure preserving transformation of $X'$ on itself. It is
clear that this measure preserving transformation induces the set isomorphism
$T'$ on $X'$ and that, therefore, $X'$ has sufficiently many measure preserving
transformations.

Lemma 12. Let $X(\mathcal X, m)$ be a measure space of total measure one which has a
separating sequence of measurable sets, and let $X' \subset X$ be any subset of $X$ with
$m^*(X') = 1$. If the measure space $X'(\mathcal X', m')$ (defined in Lemma 6) has sufficiently
many measure preserving transformations then $X'$ is an absolutely invariant
subset of $X$.

Proof. We use the notation introduced in the proof of Lemma 11. Let $T$
be any measure preserving transformation on $X$; through the correspondence
$E \rightleftarrows E' = X'E$, $T$ induces a set mapping $T'$ on $X'$. Since $X'$ has sufficiently
many measure preserving transformations, the set mapping $T'$ of $X'$ is induced
by some measure preserving transformation, say $S$, of $X'$ on itself. We shall
prove that for almost every point $x \in X'$, $Sx = Tx$.

For any set $E \in \mathcal X$ we know that $S^{-1}E' = S^{-1}(X'E)$ and $X' \cdot T^{-1}E$ differ on
at most a set of measure zero (since $S$ and $T$ induce the same set mapping on
$X'$): we denote this set of measure zero by $N_E$, and we write $N$ for the union
of all $N_E$, where we allow $E$ to run through a separating sequence. Let $x$ be
any point in $X' - N$; we assert that $Sx = Tx$. If this were not true, we could
find a set $E$, belonging to the separating sequence used above, such that $Sx \in E$
and $Tx \notin E$. Since $x \in X'$, $Sx \in X'$, and therefore $x \in S^{-1}(X'E)$; since $Tx \notin E$,
a fortiori $x \notin X' \cdot T^{-1}E$. It follows that $x \in N_E \subset N$; since this contradicts the
choice of $x$, we must have $Sx = Tx$.

We have proved that T leaves almost every point of X’ in X’: in other words 
X’ is almost invariant under 7. Since T was arbitrary, it follows that X’ is 
absolutely invariant. 

We conclude this section with an isomorphism theorem that makes clear the 
structure of measure spaces with sufficiently many measure preserving trans- 
formations. 

THEOREM 3. A necessary and sufficient condition that a proper measure space 
of total measure one have sufficiently many measure preserving transformations is 
that it be point isomorphic to an absolutely invariant subset of the unit interval. 

Proof. Since the property of possessing sufficiently many measure preserving










transformations is invariant under point isomorphism, and since, by Lemma 11, 
an absolutely invariant set has this property, sufficiency is clear. 

To prove necessity we first observe that the given measure space, $X(\mathcal X, \mathcal B, m)$,
is set isomorphic with $\bar X(\bar{\mathcal X}, \bar{\mathcal B}, \bar m)$ in virtue of Theorem 1. (It will be most
convenient in this proof to think of $\bar X$ as the perimeter of the unit circle in the
complex plane.) Consider on $\bar X$ the measure preserving transformation
$\bar x \to \lambda\bar x$, where $\lambda \in \bar X$ is a fixed number with $(\arg\lambda)/2\pi$ irrational. The set
isomorphism between $X$ and $\bar X$ makes correspond to this transformation on $\bar X$ a
certain measure preserving transformation $T$ on $X$. A set isomorphism may
also be considered as a mapping of the characteristic functions of $X$ on the
characteristic functions of $\bar X$: this mapping may be extended to all $L_2(X)$ and
thus generates an isomorphism between $L_2(X)$ and $L_2(\bar X)$. Let $\varphi(x)$ be the
correspondent on $X$ of the function $\bar\varphi(\bar x) = \bar x$ on $\bar X$; the function $\varphi(x)$ has the
following properties:

(i) $|\varphi(x)| = 1$;

(ii) $\varphi(Tx) = \lambda\varphi(x)$;

(iii) $\{\varphi^n(x)\} = \{(\varphi(x))^n\}$, $n = 0, \pm 1, \pm 2, \cdots$, is a complete orthonormal set
in $L_2(X)$.


(To be precise: since $\varphi(x)$ is determined only up to a set of measure zero, properties
(i) and (ii) need to be true only almost everywhere. It is clear, however,
that by changing $\varphi$ on a set of measure zero we may assume that (i) and (ii)
are always true. We may also assume, and we find it convenient to do so, that
$\varphi(x)$ is a Baire function.)

We apply Lemma 7 to $\{\varphi^n(x)\}$ to obtain a set $N$ of measure zero with the
property described there. By increasing $N$, if necessary, we may assume that
$N$ is invariant under $T$. We now omit the points of $N$ from $X$: we shall show
that the remainder (henceforth to be denoted by $X$ again) is in one to one
measure preserving correspondence with an absolutely invariant subset of $\bar X$.

The function $x' = \varphi(x)$ defines a mapping from $X$ to $\bar X$; we know that this
mapping is Borel measurable (i.e. that the inverse image of a set in $\bar{\mathcal B}$ lies in $\mathcal B$),
and we assert furthermore that it is univalent. For if we had $\varphi(x) = \varphi(y)$,
then we should also have $\varphi^n(x) = \varphi^n(y)$ for all $n$, and this possibility is precisely
what we eliminated when we threw away the set $N$.

The transformation $T$ is carried by the mapping $\varphi$ into some transformation
$T'$ of the range $\varphi(X) = X' \subset \bar X$ into itself; since

$$T'x' = \varphi(T\varphi^{-1}(x')) = \lambda x',$$

we see that $X'$ is invariant under the rotation $\bar x \to \lambda\bar x$.

For every Borel set $\bar E \subset \bar X$ (i.e. $\bar E \in \bar{\mathcal B}$) we define $\mu(\bar E) = m(\varphi^{-1}(\bar E))$. Since
$\varphi(T\varphi^{-1}(\bar E)) = X' \cdot \lambda\bar E$, we have $T\varphi^{-1}(\bar E) = \varphi^{-1}(X' \cdot \lambda\bar E) = \varphi^{-1}(\lambda\bar E)$. Since $T$ is
measure preserving it follows that

$$\mu(\bar E) = m(\varphi^{-1}(\bar E)) = m(T\varphi^{-1}(\bar E)) = m(\varphi^{-1}(\lambda\bar E)) = \mu(\lambda\bar E).$$










Hence, by Lemma 8, $\mu(\bar E) = \bar m(\bar E)$.

Suppose, finally, that $\bar E_1$ and $\bar E_2$ are Borel subsets of $\bar X$ for which $X'\bar E_1 = X'\bar E_2$.
Write $\bar E = (\bar E_1 - \bar E_2) + (\bar E_2 - \bar E_1)$: it follows that $X'\bar E$ is empty, so that
$\varphi^{-1}(\bar E)$ is empty and $\mu(\bar E) = m(\varphi^{-1}(\bar E)) = \bar m(\bar E) = 0$. This implies that $\bar m(\bar E_1) =
\bar m(\bar E_2)$; it follows from Lemma 6 that $\bar m^*(X') = 1$.

To sum up: we have proved that $X$ is point isomorphic with a possibly non-measurable
subset $X'$ of $\bar X$, with $\bar m^*(X') = 1$; since $X$ has sufficiently many
measure preserving transformations, so does $X'$. Lemma 12 now applies: $X'$
is absolutely invariant and the theorem is proved.


4. Application of the geometric isomorphism theorem to measure preserving 
transformations 


In this section we shall have occasion to use certain facts about measure pre- 
serving transformations and the Pontrjagin duality theory: we describe briefly 
the parts of these theories that we need. Throughout the remainder of our 
work we consider only normal spaces of total measure one. 

Two measure preserving transformations $T_1$ and $T_2$, defined, say, on $X_1$ and
$X_2$, are (point-) isomorphic¹³ if there is a point isomorphism $T$ from $X_1$ to $X_2$
with the property that $TT_1T^{-1}$ is almost everywhere equal to $T_2$. With every
measure preserving transformation $T$ we associate a unitary transformation $U$
defined on $L_2(X)$ by $Uf(x) = f(Tx)$. A measure preserving transformation $T$
has pure point spectrum if $U$ has; in other words if there exists a complete orthonormal
sequence, $\{f_n(x)\}$, of functions in $L_2(X)$ and a sequence $\Lambda = \{\lambda_n\}$ of
complex numbers (of absolute value one) such that $f_n(Tx) = \lambda_n f_n(x)$ almost
everywhere, for $n = 1, 2, \cdots$. $T$ is ergodic if $f(Tx) = f(x)$ almost everywhere,
with $f \in L_2(X)$, is equivalent to $f(x)$ = constant almost everywhere. The
spectrum, $\Lambda$, of an ergodic measure preserving transformation with pure point
spectrum is a subgroup of the multiplicative group of complex numbers of absolute
value one. The numbers $\lambda_n \in \Lambda$ are, moreover, a complete set of invariants
of $T$, in the sense that if two measure preserving transformations with pure
point spectrum have the same set $\Lambda$ of eigenvalues with the same multiplicities
then they are isomorphic.¹⁴

Concerning groups we shall need the following. A compact abelian separable
topological group, $X$, is an m-space, in the sense that we may define on it an
invariant metric $d(x, y)$ and (unique) invariant Haar measure $m(E)$ in such a
way that it becomes an m-space.¹⁵ Let $\Lambda'$ be the character group of $X$; i.e.
$\Lambda'$ is the set of all complex valued continuous functions $f(x)$ with $|f(x)| = 1$





13 Since this is the only kind of isomorphism for measure preserving transformations that
we shall use, we shall in the sequel omit the qualifying 'point-'.

14 All these statements are proved in (I) for flows: it is easy, however, to make the translation
from the one parametric case to the discrete case.

15 Invariance means that for all points $x$, $y$, and $a$, and all measurable sets $E$, we have
$d(x, y) = d(ax, ay)$ and $m(aE) = m(E)$. We find it convenient to write all groups multiplicatively,
even though they are abelian.







and $f(xy) = f(x)f(y)$. Then $\Lambda'$ is countable, and the functions $f(x) \in \Lambda'$ form a
complete orthonormal set in $L_2(X)$. Conversely let $\Lambda$ be any countable abelian
group, and let $X$ be its character group; i.e. $X$ is the set of all complex valued
functions $x(\lambda)$, defined on $\Lambda$, with $|x(\lambda)| = 1$ and $x(\lambda\mu) = x(\lambda)x(\mu)$. $X$ may
be so topologized that it becomes a compact separable (and, of course, abelian)
group. If to every $\lambda \in \Lambda$ we make correspond the function $f(x)$ on $X$, defined
by $f(x) = x(\lambda)$, then this correspondence is an isomorphism between $\Lambda$ and the
entire character group $\Lambda'$ of $X$.°

The fact that Haar measure is invariant means that the rotation $x \to ax$,
where a is any fixed element of the group, is a measure preserving transformation.
The point of introducing the seemingly irrelevant compact groups into
the study of measure preserving transformations is that such rotations are
normal forms for a large class of transformations.

THEOREM 4. An ergodic measure preserving transformation with pure point
spectrum on a normal space is isomorphic to a rotation on a compact separable
abelian group.

PROOF. Let $\Lambda$ be the spectrum of the given measure preserving transformation;
let X be the character group of $\Lambda$, and $\Lambda'$ that of X. If for every $\lambda \in \Lambda$
we define $a(\lambda) = \lambda$, then $a = a(\lambda)$ is in X. For every $x \in X$ we define Tx = ax;
we assert that T has pure point spectrum and that its spectrum is simple and
precisely equal to $\Lambda$. It has pure point spectrum because the characters $f(x) \in \Lambda'$
form a complete orthonormal system in $L_2(X)$, and every such f is an eigenfunction
of T belonging to the eigenvalue f(a), f(ax) = f(a)f(x). This shows,
moreover, that the spectrum of T, including multiplicities, is obtained by forming
the numbers f(a) for all $f \in \Lambda'$. Since to each f there corresponds (through
the isomorphism described above) an element $\lambda \in \Lambda$ for which $f(x) = x(\lambda)$ for
all x, we see that we may equally well form the numbers $a(\lambda)$, i.e. $\lambda$, for all
$\lambda \in \Lambda$. Hence T is ergodic and it follows, from the previously quoted result of
(I), that the given transformation and T are isomorphic.
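
The eigenfunction relation f(ax) = f(a)f(x) used in this proof can be checked numerically in the simplest case. The sketch below assumes X is the circle group of complex numbers of absolute value one, whose characters are the powers $f_k(x) = x^k$; the function names and the sample points a and x are illustrative choices, not part of the paper.

```python
import cmath

def character(k):
    """Character f_k(x) = x**k of the circle group {x : |x| = 1}."""
    return lambda x: x ** k

def rotation(a):
    """Rotation T x = a x by the fixed group element a."""
    return lambda x: a * x

def eigen_defect(k, a, x):
    """Return |f_k(T x) - f_k(a) f_k(x)|; zero up to rounding error."""
    f, T = character(k), rotation(a)
    return abs(f(T(x)) - f(a) * f(x))

# Hypothetical sample values: a and x are arbitrary points of the circle.
a = cmath.exp(2j * cmath.pi * 0.381966)
x = cmath.exp(2j * cmath.pi * 0.123)
worst = max(eigen_defect(k, a, x) for k in range(-5, 6))
```

Here `worst` measures how far the characters with |k| ≤ 5 are from being exact eigenfunctions with eigenvalue f_k(a); any deviation is floating point rounding only.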

Since in this proof we used only the group A of eigenvalues and not the actual 
transformation we have also the following corollary. 

COROLLARY 1. Every countable group of complex numbers of absolute value one
is the spectrum of an ergodic measure preserving transformation with pure point 
spectrum. 

Theorem 4 also enables us to characterize the set of all transformations which 
commute with a given ergodic transformation with pure point spectrum. The 
solution of this problem for general measure preserving transformations is 
probably very difficult. 

COROLLARY 2. If $x \to ax = Tx$ is an ergodic rotation on a compact abelian
group X and if S is any measure preserving transformation on X for which
ST = TS then S is also a rotation.





16 For the proof of all these statements see L. Pontrjagin, Topological groups, Princeton,
1939, Chapter V.











348 PAUL R. HALMOS AND JOHN VON NEUMANN 


PROOF. We have S(ax) = aS(x), so that if we write $b(x) = Sx \cdot x^{-1}$, then

$b(Tx) = S(ax)(ax)^{-1} = Sx \cdot x^{-1} = b(x).$

In other words b(x) is invariant under T; since T is ergodic b(x) = b = constant¹⁷
and Sx = bx, as was to be proved.

We shall call a measure preserving transformation R an involution if $R^2 = I$
(= the identity), and we shall call an involution a factor of a given transformation
T if S = RT is also an involution (so that T = SR).

COROLLARY 3. If $x \to ax = Tx$ is any rotation on a compact abelian group X,
then T may be factored, T = SR, $S^2 = R^2 = I$; if T is ergodic every factor R of
T is a reflection, $Rx = bx^{-1}$.

PROOF. Clearly if $Rx = bx^{-1}$ then R is an involution; also $Sx = RTx =
R(ax) = ba^{-1} \cdot x^{-1}$ is an involution. Conversely if T is ergodic and if T = SR,
$S^2 = R^2 = I$, then $TRT = SR \cdot R \cdot SR = R$, so that aR(ax) = Rx. It follows
as in the proof of Corollary 2 that $b(x) = x \cdot R(x)$ is invariant under T, (i.e.
$b(ax) = ax \cdot R(ax) = x \cdot R(x) = b(x)$), so that b(x) = b = constant, and
$Rx = bx^{-1}$.

COROLLARY 4. Any ergodic measure preserving transformation T with pure
point spectrum is isomorphic to its own inverse, $T^{-1} = RTR^{-1}$, where R may even
be chosen as an involution.

PROOF. From Corollary 3 we know that T = SR, $S^2 = R^2 = I$; since $T^{-1} =
R^{-1}S^{-1} = RS$, we have $T^{-1} = R \cdot SR \cdot R = R \cdot T \cdot R^{-1}$.

There seems to be some reason for the conjecture that the results of Corol- 
laries 3 and 4 are valid for an arbitrary measure preserving transformation. 

We have seen that every rotation is a measure preserving transformation with 
pure point spectrum; the question arises as to when a rotation is ergodic. The 
following theorem asserts that for rotations ergodicity (i.e. metric transitivity) 
is equivalent to regional transitivity.¹⁸

THEOREM 5. If a is a fixed element of the compact abelian group X, the rotation
$x \to ax$ is ergodic if and only if the sequence $\{a^n\}$ is everywhere dense in X.

PROOF. If $x \to ax$ is ergodic then the iterates of some point, say $x_0$, are
everywhere dense.¹⁹ Since the transformation $x \to x \cdot x_0^{-1}$ is a homeomorphism,
it carries the sequence $\{a^n x_0\}$ of iterates of $x_0$ into a dense sequence; but
$a^n x_0 x_0^{-1} = a^n$.

17 The definition of ergodicity says that numerically valued invariant functions are
constant. It is easy to verify that this implies the same result for functions (such as b(x))
whose values are in the group X.

18 For a discussion of the various kinds of transitivity see G. A. Hedlund, The dynamics of
geodesic flows, Bulletin of the American Mathematical Society, vol. 45, (1939), p. 243.

19 See Eberhard Hopf, Ergodentheorie, Berlin, 1937, p. 29.

Suppose, conversely, that $\{a^n\}$ is everywhere dense. We have already seen
that any rotation has every function f in the character group $\Lambda'$ of X for an
eigenfunction, and that the functions of $\Lambda'$ are a complete orthonormal set in
$L_2(X)$. Since eigenfunctions belonging to different eigenvalues are orthogonal,
every function invariant under the rotation $x \to ax$ must be a linear combination
of the invariant functions of the set $\Lambda'$: if the only invariant function in
$\Lambda'$ is f(x) = 1, the rotation is ergodic. Suppose then that f(ax) = f(x) for some
$f \in \Lambda'$. It follows (taking x to be the unit element of X, x = 1) that $f(a^n) =
f(1) = 1$ for all n; since $\{a^n\}$ is dense and f is continuous it follows that f(x) = 1.
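
Theorem 5 can be illustrated on the circle group written additively as the reals modulo 1, where the rotation is $x \to x + \theta$ and the powers $a^n$ become the multiples $n\theta$ mod 1. It is a standard fact that these multiples are dense exactly when $\theta$ is irrational. The sketch below (function name and sample angles are illustrative assumptions) measures the largest arc left uncovered by the first few multiples.

```python
from fractions import Fraction
import math

def largest_gap(theta, n_terms):
    """Largest arc (in full turns) left uncovered by the points
    n*theta mod 1, n = 0, ..., n_terms - 1, on the circle R/Z."""
    pts = sorted((n * theta) % 1 for n in range(n_terms))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1 - pts[-1] + pts[0])  # arc that wraps around through 0
    return max(gaps)

# Rational angle: the orbit is a finite subgroup, so a gap of 1/4 persists.
rational_gap = largest_gap(Fraction(1, 4), 1000)

# Irrational angle (golden-ratio rotation): the gaps shrink as n_terms grows.
golden = (math.sqrt(5.0) - 1.0) / 2.0
irrational_gap = largest_gap(golden, 1000)
```

For the rational angle the gap never drops below 1/4, so the rotation is not ergodic; for the irrational angle the largest gap keeps shrinking as more terms are taken, in line with density of the orbit.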

To introduce the final result of this paper we observe that Theorem 4, and 
the existence of an invariant metric on any compact separable group, imply 
that every ergodic measure preserving transformation with pure point spectrum 
is isomorphic to an isometric transformation on an m-space. Conversely:

THEOREM 6. If T is an ergodic measure preserving transformation on an m-space
X(X, m) such that to every $\epsilon > 0$ there corresponds a $\delta = \delta(\epsilon) > 0$ in such a way
that $d(x, y) < \delta$ implies $d(T^n x, T^n y) < \epsilon$, $n = 0, \pm 1, \pm 2, \cdots$, (in other words
if the family $\{T^n\}$ of transformations is equicontinuous), then T has pure point
spectrum: in fact it is possible to introduce into X a multiplication so that it becomes
(with the original topology of X) a compact separable abelian group and T
becomes a rotation.

We comment first of all on the hypothesis. Since an isometric transformation
clearly has the described equicontinuity property, on the face of it our
hypothesis is weaker than isometry. But if our hypothesis is satisfied we may
introduce into X a new metric, d'(x, y), defined by

$d'(x, y) = \sup \{\min(1, d(T^n x, T^n y)) \mid n = 0, \pm 1, \pm 2, \cdots\};$

it is easy to verify that d and d' induce the same topology on X, and
that d'(Tx, Ty) = d'(x, y). We may (and do) therefore assume that T is isometric
in the first place.
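
The construction of the invariant metric d' can be tried out on a toy example. The sketch below is only an illustration under simplifying assumptions that are not in the paper: the space is a finite set of five points on a line, and T is a cyclic permutation, so that the supremum over all integers n reduces to a maximum over one period. All names and numbers are hypothetical.

```python
def iterate(T, x, n):
    """Apply the map T to x exactly n >= 0 times."""
    for _ in range(n):
        x = T(x)
    return x

def d_prime(x, y, T, d, period):
    """sup over n of min(1, d(T^n x, T^n y)) for a map T of the given period.

    Because T^period is the identity here, n = 0, ..., period - 1 already
    exhausts all integer powers, negative ones included.
    """
    return max(min(1.0, d(iterate(T, x, n), iterate(T, y, n)))
               for n in range(period))

# Hypothetical toy space: five points on a line, cyclically permuted by T.
positions = [0.0, 0.1, 0.5, 0.55, 0.9]
T = lambda i: (i + 1) % 5
d = lambda i, j: abs(positions[i] - positions[j])
```

In this example d is not preserved by T (for instance d(0, 1) differs from d(T0, T1)), while `d_prime` takes the same value on (x, y) and (Tx, Ty), which is the invariance property stated above.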

We shall make the proof of Theorem 6 depend on the following two lemmas 
which have an interest of their own. 

Lemma 13. If on an m-space X there exists an ergodic and isometric measure 
preserving transformation then X is compact. 

PROOF. Let T be an ergodic and isometric transformation; since X is complete
we have to show only that it is totally bounded. If it is not, then there
is an $\epsilon > 0$ and an infinite sequence of points $x_1, x_2, \cdots$ in X such that the
open spheres $S_n$ of radius $\epsilon$ with center at $x_n$ are pairwise disjoint. Let $x_0$ be
any point of X whose iterates $\{T^n x_0\}$ are everywhere dense in X, and choose
for each $n = 1, 2, \cdots$ an integer k = k(n) such that $d(x_n, T^k x_0) < \epsilon/2$. If we
denote by $S_0$ the open sphere of radius $\epsilon/2$ with center at $x_0$, then for each n,
$T^k S_0 \subset S_n$, so that $m(S_n) \geq m(S_0) > 0$. Since a measure space has, by
definition, finite measure, there cannot exist an infinite sequence of pairwise
disjoint sets whose measure is bounded away from zero; it follows that X is
totally bounded and therefore compact.

Lemma 14. Let X be any compact group (not necessarily separable or abelian)
and let m(E) be any finite measure, defined (at least) for all Borel sets of X, such
that the measure of an open set is positive and that the measure of any measurable
set is the lower bound of the measure of open sets containing it. Then the set $X_0$
of all $x \in X$ for which m(xE) = m(E) for all measurable sets E is a closed subgroup
of X.







PROOF. Since $x \in X_0$ and $y \in X_0$ implies

$m(xy^{-1}E) = m(y^{-1}E) = m(y(y^{-1}E)) = m(E),$

$X_0$, and consequently its closure $\bar{X}_0$, is a subgroup; we shall prove $\bar{X}_0 \subset X_0$.

Take $x \in \bar{X}_0$, and let E be any closed (and hence compact) subset of X. Let
O be any open set, $O \supset xE$, and let N be a neighborhood of 1 (= the unit element
of X) such that for $a \in N$, $axE \subset O$. Then Nx is a neighborhood of x, so that
the intersection of Nx and $X_0$ is not empty; say y = ax, $a \in N$, $y \in X_0$. Then

$m(E) = m(yE) = m(axE),$

and since $axE \subset O$, $m(E) \leq m(O)$. In other words $xE \subset O$ implies that $m(E) \leq
m(O)$: our condition on m implies that $m(E) \leq m(xE)$. Applying this result
to the compact set xE and the point $x^{-1} \in \bar{X}_0$ (in place of E and x) we obtain
$m(xE) \leq m(E)$, so that m(xE) = m(E) for all closed sets E. It follows that
m(xE) = m(E) for all measurable sets E, as was to be proved.

PROOF OF THEOREM 6. Let $x_0$ be any point in X for which $\{T^n x_0\}$ is everywhere
dense; write $x_n = T^n x_0$ for $n = \pm 1, \pm 2, \cdots$. For $x = x_n$ and $y = x_m$
we define $p(x, y) = x_{n+m}$, and $r(x) = x_{-n}$. If $x' = x_{n'}$, $x'' = x_{n''}$, $y' = x_{m'}$,
$y'' = x_{m''}$, then

$$\begin{aligned}
d(p(x', y'), p(x'', y'')) &= d(x_{n'+m'}, x_{n''+m''}) \\
&\leq d(x_{n'+m'}, x_{n'+m''}) + d(x_{n'+m''}, x_{n''+m''}) \\
&= d(x_{m'}, x_{m''}) + d(x_{n'}, x_{n''}) \\
&= d(y', y'') + d(x', x'');
\end{aligned}$$

in other words p(x, y) is uniformly continuous throughout its domain of definition;
similarly since we have

$d(r(x), r(y)) = d(x_{-n}, x_{-m}) = d(x_{-n+n+m}, x_{-m+n+m}) = d(y, x),$

r(x) is uniformly continuous throughout its domain. The domain of p(x, y) is
an everywhere dense subset of the product space of X with itself, and the domain
of r(x) is an everywhere dense subset of X, consequently they each have a unique
continuous extension, to all the product space and all X respectively.

The rest of the proof is now easy. We define, for every x and y in X, xy =
p(x, y) and $x^{-1} = r(x)$; it is clear that with these definitions X becomes an
abelian topological group. We may write, for any $x = x_n$ and an arbitrary y,
$p'(x, y) = T^n y$; then p'(x, y) is a continuous extension of our original p(x, y)
and therefore (because of the uniqueness of extension) $T^n y = x_n y$. (For n = 1,
we obtain, in particular, $Ty = x_1 y$ for all y. The originally chosen element $x_0$
is now the unit element of the group.) If E is any measurable set then $T^n E =
x_n E$ has the same measure as E, so that measure is preserved by an everywhere
dense set of x's; since, by Lemma 13, X is compact, Lemma 14 implies that for
all x and all measurable sets E, m(xE) = m(E). The uniqueness of Haar
measure implies that m is the Haar measure of the group X; this completes the
proof that T is a rotation, and hence has pure point spectrum.


INSTITUTE FOR ADVANCED STUDY












ANNALS OF MATHEMATICS
Vol. 43, No. 2, April, 1942 


THE BROWNIAN MOVEMENT AND STOCHASTIC EQUATIONS 
By J. L. DOOB
(Received January 14, 1942) 


The irregular movements of small particles immersed in a liquid, caused by
the impacts of the molecules of the liquid, were described by Brown in 1828.¹
Since 1905 the Brownian movement has been treated statistically, on the basis
of the fundamental work of Einstein and Smoluchowski. Let x(t) be the
x-coordinate of a particle at time t. Einstein and Smoluchowski treated x(t)
as a chance variable. They found the distribution of x(t) - x(0) to be Gaussian,
with mean 0 and variance $\alpha|t|$, where $\alpha$ is a positive constant which can be
calculated from the physical characteristics of the moving particles and the given
liquid. More exactly, such a family of chance variables {x(t)} is now described
as the family of chance variables determining a temporally homogeneous differential
stochastic process: the distribution of x(s + t) - x(s) is Gaussian, with
mean 0, variance $\alpha|t|$, and if $t_1 < \cdots < t_n$,

$x(t_2) - x(t_1), \cdots, x(t_n) - x(t_{n-1})$

are mutually independent chance variables. Wiener, who was the first to discuss
this stochastic process rigorously, proved in 1923 that the functions x(t)
of this stochastic process are continuous, with probability 1.² This is of course
a desirable result, which makes the stochastic process somewhat more acceptable
as the mathematical idealization of the Brownian movement. It was not expected,
however, that the above distribution of x(s + t) - x(s) would prove
correct for small t. Even if the derivation did not break down for small t, the
mathematical fact that x(s + t) - x(s) has standard deviation $(\alpha|t|)^{1/2}$, so that
x(s + t) - x(s) is of the order of magnitude of $|t|^{1/2}$, implying that dx(s)/ds
cannot be finite, would suggest the desirability of modifications of the Einstein-
Smoluchowski distributions. In fact it is easily seen that (with probability 1)
x(t) is not even of bounded variation, so that the path curves of the Einstein-
Smoluchowski process have infinite length!

A different stochastic process describing the x(t) was in fact derived in 1930
by Ornstein and Uhlenbeck (15),³ and later by S. Bernstein (1), (2) and Krutkow
(11), all using different methods. This new distribution of x(s + t) - x(s) is





1 For a historical account of the subject up to 1913, see Haas-Lorentz (6). (The
numbered references will refer to the bibliography at the end of the paper.)

2 Wiener (18, pp. 148-151) has since given a more simple proof. For a discussion of the
exact meaning of such a statement concerning the continuity of paths, cf. Doob (3) and (5),
§2. The result means that x(t) can be treated as representing one of a multiplicity of
continuous functions of t, and integrated, etc. Probability here is formally the study of
measures on certain spaces of functions.

3 Cf. also Ornstein and Wijk (16) and Wijk (17). References to work since 1913 are given
in Ornstein and Uhlenbeck (15).



Gaussian, with mean 0 and variance $(\alpha/\beta)(e^{-\beta|t|} - 1 + \beta|t|)$, approximately
$\alpha|t|$ for large t, but $\alpha\beta t^2/2$ for small t. (Here $\beta$ is a second physically determined
constant.)

The purpose of the present paper is to apply the methods and results of 
modern probability theory to the analysis of the Ornstein-Uhlenbeck distribu- 
tion, its properties and its derivation. It will be seen that the use of rigorous 
methods actually simplifies some of the formal work, besides clarifying the 
hypotheses. A stochastic differential equation will be introduced in a rigorous 
way to give a precise meaning to the Langevin differential equation for the 
velocity function dx(s)/ds. This will avoid the usual embarrassing situation in
which the Langevin equation, involving the second derivative of x(s), is used to
find a solution x(s) not having a second derivative.


1. The velocity distribution 


The displacement function x(t), as discussed by Ornstein and Uhlenbeck, has
a derivative u(t), and all the probability relations needed can be derived from
those of u(t), as will be seen below. The distribution of u(t) can be described
as follows: the conditional distribution of u(s + t) (t > 0) for given $u(s) = u_0$
is Gaussian, with mean $u_0 e^{-\beta t}$ and variance $\sigma_0^2(1 - e^{-2\beta t})$. Here $\sigma_0^2$, $\beta$ are
physically determined constants. When $t \to \infty$, this distribution becomes the
Maxwell distribution of velocities, furnishing stationary absolute (unconditioned)
probabilities for the process, if these are desired. Using these absolute
probabilities, which make the distribution easier to describe, the full description
of the u(t) distribution can then be stated as follows: for each t, u(t) is a chance
variable with a Gaussian distribution, having mean 0, variance $\sigma_0^2$; the transition
probabilities are as just described; the process is a Markoff process.⁴ This last
fact means that the Maxwell distribution of u(t) for each fixed t, and the
transition probabilities just described determine the full set of probability relations
of the process. Under these conditions, if $t_1 < t_2$, the pair $u(t_1)$, $u(t_2)$ has
a bivariate Gaussian distribution, with zero means, equal variances $\sigma_0^2$, and
correlation coefficient $e^{-\beta(t_2 - t_1)}$. This stochastic process goes back at least to
Smoluchowski, although it was first derived by Ornstein and Uhlenbeck as the
process describing the velocity of a particle in Brownian motion. Ornstein and
Uhlenbeck were only interested in the transition probabilities. The formal
manipulations made below will show that there are technical advantages in
defining (unconditioned) probabilities for the u(t) also. The above described
process will be called the O. U. process below.
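
The description above translates directly into a sampling recipe: start from the stationary Gaussian distribution and repeatedly draw from the Gaussian transition law. The following is a minimal sketch under that reading; the function names, step size, seed and parameter values are illustrative assumptions and do not come from the paper.

```python
import math
import random

def ou_step(u, h, beta, sigma0_sq, rng):
    """Draw u(s + h) given u(s) = u from the Gaussian transition law:
    mean u*exp(-beta*h), variance sigma0_sq*(1 - exp(-2*beta*h))."""
    decay = math.exp(-beta * h)
    std = math.sqrt(sigma0_sq * (1.0 - decay * decay))
    return u * decay + std * rng.gauss(0.0, 1.0)

def ou_path(n_steps, h, beta, sigma0_sq, seed=0):
    """Stationary sample path u(0), u(h), ..., u(n_steps*h)."""
    rng = random.Random(seed)
    u = math.sqrt(sigma0_sq) * rng.gauss(0.0, 1.0)  # stationary start
    path = [u]
    for _ in range(n_steps):
        u = ou_step(u, h, beta, sigma0_sq, rng)
        path.append(u)
    return path

def sample_variance_and_lag1(path):
    """Empirical variance and lag-one correlation, taking the mean as 0."""
    n = len(path)
    var = sum(u * u for u in path) / n
    lag1 = sum(a * b for a, b in zip(path, path[1:])) / (n - 1) / var
    return var, lag1
```

On a long path the empirical variance should be close to $\sigma_0^2$ and the lag-one correlation close to $e^{-\beta h}$, matching the bivariate distribution stated above; the agreement is statistical, not exact.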

The following theorem shows that such a process is essentially determined by 
three fundamental properties, of which at least the first two have simple physical 





4 A process is called a Markoff process if whenever $t_1 < \cdots < t_n$, the conditional
distribution of $u(t_n)$ for given values of $u(t_1), \cdots, u(t_{n-1})$ actually depends only on
$u(t_{n-1})$. It is in this case, and only in this case, that the Smoluchowski equation between
the transition probabilities, and the Fokker-Planck differential equations for the
transitional probabilities are valid.










significance. (We can exclude Case A of the theorem, since it obviously does 
not fit the physical picture.) 

THEOREM 1.1. Let u(t) $(-\infty < t < +\infty)$ be a one-parameter family of chance
variables, determining a stochastic process with the following properties.

1. The process is temporally homogeneous.⁵

2. The process is a Markoff process.

3. If s, t are arbitrary distinct numbers, u(s), u(t) have a (non-singular) bivariate
Gaussian distribution.

Define m, $\sigma_0^2$ by⁶

(1.1.1)    $m = E\{u(t)\}, \qquad \sigma_0^2 = E\{[u(t) - m]^2\}.$

Then the given process is one of the following two types.

(A) If $t_1 < \cdots < t_n$, $u(t_1), \cdots, u(t_n)$ are mutually independent Gaussian
chance variables, with mean m and variance $\sigma_0^2$.

(B) (O. U. process) There is a constant $\beta > 0$ such that if $t_1 < \cdots < t_n$,
$u(t_1), \cdots, u(t_n)$ have an n-variate Gaussian distribution, with common mean m
and variance $\sigma_0^2$, and correlation coefficients determined by the equation

$E\{[u(t) - m][u(s) - m]\} = \sigma_0^2 e^{-\beta|t-s|}.$

Instead of considering u(t), we can consider $(1/\sigma_0)[u(t) - m]$, which has
mean 0 and variance 1. Then we shall assume in the following that u(t) itself
has these properties: m = 0, $\sigma_0 = 1$. Let $\rho(t)$ be the correlation function:
$\rho(t) = E\{u(s + t)u(s)\}$, independent of s by Property 1. If s < t, the conditional
distribution of u(t) for given u(s) has density

(1.1.2)    $\frac{1}{(2\pi)^{1/2}(1 - \rho^2)^{1/2}} \exp\left(-\frac{[u(t) - \rho u(s)]^2}{2(1 - \rho^2)}\right), \qquad \rho = \rho(t - s),$

(Property 3). If $t_1 < \cdots < t_n$, $u(t_1), \cdots, u(t_n)$ then have an n-variate Gaussian
distribution with density

(1.1.3)    $\frac{1}{(2\pi)^{n/2} \prod_{j=1}^{n-1} (1 - \rho_j^2)^{1/2}} \exp\left(-\frac{u_1^2}{2} - \frac{1}{2} \sum_{j=1}^{n-1} \frac{(u_{j+1} - \rho_j u_j)^2}{1 - \rho_j^2}\right),$

$\rho_j = \rho(t_{j+1} - t_j)$, $u_j = u(t_j)$,

using Property 2. Now if $u_1, \cdots, u_n$ have an n-variate Gaussian distribution
with density

(1.1.4)    $\frac{1}{(2\pi)^{n/2} \Delta^{1/2}} \exp\left(-\frac{1}{2} \sum_{i,j} a_{ij} u_i u_j\right),$

$\Delta = \det(E\{u_i u_j\})$ is the determinant of the matrix of variances and covariances,
and $(a_{ij})$ is the inverse of this matrix. Using these facts we can calculate
$\rho(t_3 - t_1) = E\{u_1 u_3\}$ in (1.1.3) with n = 3, and find that $\rho(t_3 - t_1) = \rho_1 \rho_2$,
that is


5 That is, the probability distributions are unaffected by translations of the t-axis.
6 The expectation of the chance variable v will be denoted by E{v}.










(1.1.5)    $\rho(t_3 - t_1) = \rho(t_2 - t_1)\rho(t_3 - t_2).$

Then $\rho(t)$ is an even function; $|\rho(t)| \leq 1$ (Schwarz's inequality); and according
to (1.1.5) $\rho(s + t) = \rho(s)\rho(t)$ for all positive s, t. Under these conditions either
$\rho(t) \equiv 0$ or there is a constant $\beta \geq 0$ such that

(1.1.6)    $\rho(t) = e^{-\beta|t|}.$

In the present case, $\beta > 0$, by Property 3 (non-singularity of the given bivariate
distributions). Evidently $\rho(t) \equiv 0$ furnishes Case A of the theorem, which
certainly has the three given properties. If $\rho(t)$ is given by (1.1.6) with $\beta > 0$,
we show first that the matrix $(a_{ij})$, the inverse of $(\rho(t_i - t_j))$, actually determines
a Gaussian density distribution (1.1.4). To see this we consider the density
function (1.1.3) with $\rho_j = e^{-\beta(t_{j+1} - t_j)}$. The coefficients of the quadratic form
in the exponent of (1.1.3) are easily evaluated and the matrix of the form is
found to be the inverse of the matrix $(e^{-\beta|t_i - t_j|})$. Thus (1.1.3) actually is the
required probability density. Moreover the probability densities obtained in
this way (as the $t_j$ vary) are mutually consistent, because integrating out any
variable leaves a quadratic form of the same type, without the integrated
variable, but with the same rule determining the coefficients. The correlation
function (1.1.6) therefore determines a stochastic process. The process obviously
is a Markoff process because of the form of the probability density (1.1.3):
an initial factor involving $u_1$ only, followed by the product of functions of pairs
of adjacent variables. The proof of the theorem is now complete.
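
The claim that the matrix of the quadratic form in (1.1.3) is the inverse of $(e^{-\beta|t_i - t_j|})$ can be checked numerically. The sketch below builds the matrix of the form term by term from the exponent of (1.1.3) (with $\sigma_0 = 1$, m = 0) and multiplies it by the correlation matrix; the function names and the sample times are illustrative assumptions.

```python
import math

def precision_from_density(times, beta):
    """Matrix A of the quadratic form u_1^2 + sum_j (u_{j+1} - rho_j u_j)^2 / (1 - rho_j^2),
    with rho_j = exp(-beta * (t_{j+1} - t_j)); times must be increasing."""
    n = len(times)
    A = [[0.0] * n for _ in range(n)]
    A[0][0] = 1.0                         # initial factor involving u_1 only
    for j in range(n - 1):
        rho = math.exp(-beta * (times[j + 1] - times[j]))
        c = 1.0 / (1.0 - rho * rho)
        A[j + 1][j + 1] += c              # u_{j+1}^2 term
        A[j][j] += c * rho * rho          # u_j^2 term
        A[j][j + 1] -= c * rho            # cross terms, split symmetrically
        A[j + 1][j] -= c * rho
    return A

def correlation_matrix(times, beta):
    """Matrix with entries exp(-beta * |t_i - t_j|)."""
    return [[math.exp(-beta * abs(s - t)) for t in times] for s in times]

def matmul(P, Q):
    """Plain product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

The matrix returned by `precision_from_density` is tridiagonal, which is the matrix form of the Markoff property: only adjacent variables are coupled in the exponent.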

According to a theorem of Khintchine ((9) p. 608), $\rho(t)$ is the correlation function
of a temporally homogeneous stochastic process if and only if it can be put
in the form

(1.1.7)    $\rho(t) = \int_0^{\infty} \cos \lambda t \, dF(\lambda),$

where $F(\lambda)$ is monotone non-decreasing and bounded. In Case B of the theorem,
(1.1.7) is true when $F(\lambda)$ is given by

(1.1.8)    $F(x) = \frac{2\beta\sigma_0^2}{\pi} \int_0^{x} \frac{d\lambda}{\beta^2 + \lambda^2}.$

In the stochastic process of Case B, the variance of u(s + t) - u(s) is
$2\sigma_0^2\beta|t|$ for small t:

(1.1.9)    $E\{[u(s + t) - u(s)]^2\} = 2\sigma_0^2(1 - e^{-\beta|t|}) \sim 2\sigma_0^2\beta|t|.$

Thus u(s + t) - u(s) is of the order of magnitude of $|t|^{1/2}$, and du/dt cannot
exist. Physically this means that the particles in question do not have a finite
acceleration (if the given stochastic process represents the Brownian movement
that closely).

THEOREM 1.2. If u(t) is the representative function of the stochastic process of 
Theorem 1.1 Case B, u(t) is a continuous function of t, with probability 1. 










Let v(t) be determined by the equation

(1.2.1)    $v(t) = t^{1/2}\, u\left(\frac{1}{2\beta} \log t\right), \qquad t > 0.$

Then v(t) has the property that if $t_1 < \cdots < t_n$, $v(t_1), \cdots, v(t_n)$ have an
n-variate Gaussian distribution. We find by direct calculation (taking m = 0):

$$\begin{aligned}
&E\{v(s + t) - v(s)\} = 0, \\
(1.2.2)\qquad &E\{[v(s + t) - v(s)]^2\} = \sigma_0^2 t, \\
&E\{[v(s_2) - v(s_1)][v(t_2) - v(t_1)]\} = 0, \qquad (s_1 < s_2 \leq t_1 < t_2).
\end{aligned}$$

Then v(t) determines a differential process, in fact precisely the original Einstein-
Smoluchowski process. Since Wiener has proved continuity of the path functions
in this case, the theorem follows.
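
The "direct calculation" behind (1.2.2) amounts to showing that the covariance of v is $\sigma_0^2 \min(s, t)$, which is the covariance of the Einstein-Smoluchowski process. A small numerical sketch of that identity follows; it only uses the O. U. covariance $\sigma_0^2 e^{-\beta|t-s|}$ and the substitution (1.2.1), and the function names and parameter values are illustrative assumptions.

```python
import math

def ou_cov(s, t, sigma0_sq, beta):
    """Covariance E{u(s)u(t)} of the O. U. process with mean 0."""
    return sigma0_sq * math.exp(-beta * abs(t - s))

def v_cov(s, t, sigma0_sq, beta):
    """Covariance E{v(s)v(t)} for v(t) = sqrt(t) * u(log(t) / (2*beta)), s, t > 0."""
    tau_s = math.log(s) / (2.0 * beta)
    tau_t = math.log(t) / (2.0 * beta)
    return math.sqrt(s * t) * ou_cov(tau_s, tau_t, sigma0_sq, beta)

# Hypothetical parameter values for a spot check.
sigma0_sq, beta = 1.7, 0.9
pairs = [(0.5, 2.0), (1.0, 3.0), (0.2, 0.2), (4.0, 0.25)]
defects = [abs(v_cov(s, t, sigma0_sq, beta) - sigma0_sq * min(s, t))
           for s, t in pairs]
```

Since the covariance is $\sigma_0^2 \min(s, t)$, increments over non-overlapping intervals are uncorrelated and an increment over an interval of length t has variance $\sigma_0^2 t$, as in (1.2.2).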

The transition from u(t) to v(t) just used reduces every property of the
Ornstein-Uhlenbeck stochastic process to a corresponding property of the 
Einstein-Smoluchowski process, and vice versa. Many properties of the indi- 
vidual functions of the latter process, that is, properties possessed by almost all 
the individual functions, in other words possessed ‘“‘with probability 1,” have 
been proved in recent years, besides the continuity property we have just used. 
The following theorem gives the counterparts of two of these for the O. U. 
process. 

THEOREM 1.3. If u(t) is the representative function of the O. U. process of
Theorem 1.1 Case B,

(1.3.1)    $\limsup_{t \to 0} \frac{u(t) - u(0)}{(4\sigma_0^2 \beta t \log\log(1/t))^{1/2}} = 1, \qquad \limsup_{t \to \infty} \frac{u(t)}{(2\sigma_0^2 \log t)^{1/2}} = 1,$

with probability 1.

Let v(t) be defined by (1.2.1). Then Khintchine ((10) pp. 68-75) has proved

(1.3.2)    $\limsup_{t \to 0} \frac{v(1 + t) - v(1)}{(2\sigma_0^2 t \log\log(1/t))^{1/2}} = 1, \qquad \limsup_{t \to \infty} \frac{v(t)}{(2\sigma_0^2 t \log\log t)^{1/2}} = 1,$

and (1.3.2) becomes (1.3.1) when v(t) is expressed in terms of u(t).











2. The distribution of displacements 


It does not seem to have been realized by earlier writers that the distribution
of displacements in the O. U. process can be obtained directly from that of the
velocities. In fact, we have seen that as t varies, u(t) can be considered as one of a
multiplicity of continuous functions of t. Integration of u(t) is therefore admissible,
and will give the displacement function. If x(t) is the x-coordinate
of a particle at time t,

(2.1)    $x(t) - x(0) = \int_0^t u(s)\, ds$







with probability 1 (that is, neglecting the discontinuous u(t) functions which
have total probability 0). The main advantage of the rigorous approach to
stochastic processes depending on a continuous parameter is precisely that the
u(t) of the process, as t varies, can be regarded as an individual function or
rather, as one of many functions with whatever regularity properties the given
probability distributions imply. Theorem 1.3 limits the actual upper bounds
of the velocity functions u(t). The following result takes advantage of the
oscillations in sign.

THEOREM 2.1. If u(t) is the representative function of the O. U. process of
Theorem 1.1 Case B, with m = 0,

(2.1.1)    $\lim_{t \to \infty} \frac{1}{t} \int_0^t u(s)\, ds = \lim_{t \to \infty} \frac{x(t) - x(0)}{t} = 0,$

with probability 1.

This theorem is simply the ergodic theorem applied to the u(t) process to give
the strong law of large numbers (cf. Doob (4) p. 294). From (2.2.3) below, it
is quite obvious that the expectation of the square of the left side of (2.1.1)
goes to 0 as $t \to \infty$, so that the left side goes to 0 in the mean. The strength of
(2.1.1) is that it is a statement about the individual path functions,
or physically, a statement about the path of a single particle. The same was
true in Theorems 1.2 and 1.3.

In order to find the distribution of x(t) - x(0) we proceed as follows. Riemann
integrability of u(t) implies that (with probability 1)

(2.2.1)    $x(t) - x(0) = \lim_{n \to \infty} \sum_{j=1}^{n} u(tj/n)\, t/n.$

Now the n-variate distribution of the variables summed is Gaussian. Then the
sum is Gaussian, so the distribution of x(t) - x(0) is also Gaussian, if it can be
shown that the variance of x(t) - x(0) is positive. The distribution of
x(t) - x(0) is thus completely determined by its first two moments, which we
proceed to calculate. We shall suppose that $E\{u(t)\} = 0$, $E\{u(t)^2\} = \sigma_0^2$.
Then we find

(2.2.2)    $E\{x(t) - x(0)\} = \int_0^t E\{u(s)\}\, ds = 0,$⁷

and, if t > 0,

(2.2.3)    $$\begin{aligned}
E\{[x(t) - x(0)]^2\} &= \int_0^t \int_0^t E\{u(s)u(s')\}\, ds\, ds' \\
&= \sigma_0^2 \int_0^t \int_0^t e^{-\beta|s - s'|}\, ds\, ds' = \frac{2\sigma_0^2}{\beta^2}(e^{-\beta t} - 1 + \beta t).
\end{aligned}$$
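
The closed form in (2.2.3) can be compared with a direct numerical evaluation of the double integral. The sketch below uses a midpoint Riemann sum; the function names, grid size and parameter values are illustrative assumptions, and the agreement is only up to discretization error.

```python
import math

def displacement_variance(t, sigma0_sq, beta):
    """Closed form (2 sigma0^2 / beta^2)(exp(-beta t) - 1 + beta t), t > 0."""
    return 2.0 * sigma0_sq / beta ** 2 * (math.exp(-beta * t) - 1.0 + beta * t)

def displacement_variance_riemann(t, sigma0_sq, beta, n=200):
    """Midpoint Riemann sum for sigma0^2 times the double integral of
    exp(-beta |s - s'|) over the square 0 <= s, s' <= t."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for k in range(n):
            sp = (k + 0.5) * h
            total += math.exp(-beta * abs(s - sp))
    return sigma0_sq * total * h * h
```

The same closed form also shows the two regimes: for small t it behaves like $\sigma_0^2 t^2$, and for large t like $2\sigma_0^2 t/\beta$.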


7 By Fubini’s integration theorem, we can find the expectations under the integral sign, 
before integrating with respect to s. 













The same sort of argument shows that if $t_1, \cdots, t_n$ are any distinct numbers,
the chance variables

$\{x(t_j) - x(0),\ u(t_j),\ j = 1, \cdots, n\}$

have a 2n-variate Gaussian distribution, which can then be evaluated explicitly
by finding the first and second moments. For example, the following equations
determine the bivariate distribution of x(t) - x(0), u(t), (t > 0):

(2.2.4)    $$\begin{aligned}
&E\{[x(t) - x(0)]u(t)\} = \int_0^t E\{u(t)u(s)\}\, ds = \frac{\sigma_0^2}{\beta}(1 - e^{-\beta t}), \\
&E\{x(t) - x(0)\} = 0, \qquad E\{[x(t) - x(0)]^2\} = \frac{2\sigma_0^2}{\beta^2}(e^{-\beta t} - 1 + \beta t), \\
&E\{u(t)\} = 0, \qquad E\{u(t)^2\} = \sigma_0^2.
\end{aligned}$$

Thus the bivariate density of x(t) - x(0), u(t) is Gaussian, with common
mean 0, and variances $(2\sigma_0^2/\beta^2)(e^{-\beta t} - 1 + \beta t)$, $\sigma_0^2$, respectively, and correlation
coefficient

(2.2.5)    $\frac{1 - e^{-\beta t}}{[2(e^{-\beta t} - 1 + \beta t)]^{1/2}}.$

It is to be expected that if $s_1 < s_2 \leq t_1 < t_2$, $x(s_2) - x(s_1)$ and $x(t_2) - x(t_1)$
become independent as $t_1 \to \infty$. In fact, these two normally distributed variables
have correlation coefficient

(2.2.6)    $\frac{(e^{\beta s_2} - e^{\beta s_1})(e^{-\beta t_1} - e^{-\beta t_2})}{2(e^{-\beta(s_2 - s_1)} - 1 + \beta(s_2 - s_1))^{1/2}(e^{-\beta(t_2 - t_1)} - 1 + \beta(t_2 - t_1))^{1/2}},$

which goes to 0 when $t_1$ and $t_2$ become infinite.

If in this discussion only the conditional distribution functions are wanted,
for $u(0) = u_0$, for example, two procedures are possible. Setting $u(0) = u_0$
instead of using the initial distribution we have used above, and carrying out the
same type of calculations as above, would give the desired conditional probabilities.
Or the conditional distributions could be calculated from the distributions
just derived, since the conditional distributions of a multivariate Gaussian
distribution are easily found. Theorems 1.2, 1.3 and 2.1 hold no matter what
initial distribution is assigned to u(0).

Finally, there is one more fact which we shall need in the next section. Define
B(t) by

(2.2.7)    $B(t) = \beta[x(t) - x(0)] + u(t) - u(0).$

Then B(t) has for each t a Gaussian distribution, with mean 0. Evidently the
distribution of B(s + t) - B(s) is independent of s. It is Gaussian, with
mean 0, and the variance is easily calculated to be $2\sigma_0^2\beta|t|$. Moreover, if
$s_1 < s_2 \leq t_1 < t_2$,







(2.2.8)    $E\{[B(t_2) - B(t_1)][B(s_2) - B(s_1)]\} = 0.$


Thus the B(t)-process is again the Einstein-Smoluchowski process.
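
The variance of B(t) in (2.2.7) can be assembled from the moments already computed. The sketch below does this with the closed forms (2.2.3), (2.2.4) and (1.1.9); it additionally assumes the covariance of x(t) - x(0) with u(0), which is obtained by the same integration as in (2.2.4) and equals $(\sigma_0^2/\beta)(1 - e^{-\beta t})$. Function names and parameter values are illustrative.

```python
import math

def var_x(t, s2, beta):
    """E{[x(t) - x(0)]^2} from (2.2.3), for t > 0."""
    return 2.0 * s2 / beta ** 2 * (math.exp(-beta * t) - 1.0 + beta * t)

def cov_x_u_increment(t, s2, beta):
    """E{[x(t) - x(0)][u(t) - u(0)]}: the covariances with u(t) and with u(0)
    are both (s2 / beta)(1 - exp(-beta t)), so their difference is zero."""
    with_ut = s2 / beta * (1.0 - math.exp(-beta * t))
    with_u0 = s2 / beta * (1.0 - math.exp(-beta * t))
    return with_ut - with_u0

def var_u_increment(t, s2, beta):
    """E{[u(t) - u(0)]^2} from (1.1.9)."""
    return 2.0 * s2 * (1.0 - math.exp(-beta * t))

def var_B(t, s2, beta):
    """Variance of B(t) = beta [x(t) - x(0)] + u(t) - u(0)."""
    return (beta ** 2 * var_x(t, s2, beta)
            + 2.0 * beta * cov_x_u_increment(t, s2, beta)
            + var_u_increment(t, s2, beta))
```

The exponential terms cancel and `var_B(t, s2, beta)` reduces to `2 * s2 * beta * t`, the linear growth in t characteristic of the Einstein-Smoluchowski process.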


3. Derivation of the velocity distribution using the Langevin equation 


Ornstein and Uhlenbeck base their investigation on the Langevin equation

(3.1)    $\frac{du(t)}{dt} = -\beta u(t) + A(t),$

which is simply Newton's law of motion applied to a particle, after dividing
through by the mass. The first term on the right is due to the frictional resistance
or its analogue, which is supposed proportional to the velocity. The
second term represents the random forces (molecular impacts). Probability
hypotheses are imposed on the A(t), including relations between A(t) and u(t),
to determine the u(t) distribution. Unfortunately this u(t) distribution (Case B
of Theorem 1.1), as we have seen, has the property that the velocity function
has no time derivative. Then the solution can hardly satisfy (3.1).

Bernstein ((2) p. 361) replaces (3.1) by a finite difference equation: 


(3.2) A = = —BAEn + an, nm = 1,2, «+s, 
Here &, &,-+-- is a sequence of chance variables, Aé&, = &n41 — & ete., and 
a, &, °°: is a given sequence of mutually independent chance variables. If 
we think of £; as the analogue of x«(jAt), the correspondence between (3.2) and 
(3.1) is clear. The equations of (3.2) determine definite distributions for the 
£;in terms of those of the a;. Bernstein shows that as At — 0 the distribution 
of Aé,/At (~ Aaz/At) becomes the u(t) distribution we have been discussing, if 
suitable hypotheses are made on the a;. This approach is essentially different 
from that of Ornstein and Uhlenbeck in that Bernstein, as he states explicitly 
((1) pp. 5, 6) is not writing a difference equation in the displacement functions 
x(t) themselves: (3.2) determines distributions only, and these are approximated 
by the limiting distributions described in Theorem 1.1 Case B. 
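
A small numerical sketch may help make the recursion (3.2) concrete. It assumes, beyond what the text states, that the $\alpha_n$ are Gaussian with mean 0 and variance proportional to $\Delta t$; the function name `bernstein_path` and the parameter `noise_var_rate` are hypothetical, chosen only for illustration.

```python
import math
import random

def bernstein_path(beta, dt, n_steps, noise_var_rate, xi0=0.0, dxi0=0.0, seed=0):
    """Iterate (3.2): (dxi_{n+1} - dxi_n) / dt = -beta * dxi_n + alpha_n.

    ASSUMPTION (not from the text): alpha_n ~ N(0, noise_var_rate * dt),
    mutually independent. Returns the lists of xi_n and of dxi_n / dt.
    """
    rng = random.Random(seed)
    xi, dxi = xi0, dxi0
    positions, velocities = [xi], [dxi / dt]
    for _ in range(n_steps):
        alpha = rng.gauss(0.0, math.sqrt(noise_var_rate * dt))
        xi = xi + dxi                          # xi_{n+1} = xi_n + Delta xi_n
        dxi = dxi + dt * (-beta * dxi + alpha)  # Delta xi_{n+1} from (3.2)
        positions.append(xi)
        velocities.append(dxi / dt)            # analogue of the velocity u
    return positions, velocities
```

With the noise switched off, the velocity analogue $\Delta\xi_n/\Delta t$ decays geometrically by the factor $1 - \beta\Delta t$ per step, which is the discrete counterpart of the frictional term in (3.1).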

In our treatment, we shall replace the Langevin equation by a formalized 
differential equation for the velocity function u(t). This equation is to be 
exact, not merely asymptotically true. The equation will be perfectly proper 
mathematically, so that solution by ordinary methods will provide all the infor- 
mation relevant to the desired distributions, and solution of more general prob- 
lems, involving external forces, will require no special methods. 

The problem is to find a proper stochastic analogue of the Langevin equation, 
remembering that we do not expect u’(t) to exist. We write the equation in 
the following form: 


(3.3)    $du(t) = -\beta u(t)\,dt + dB(t),$


and try to give these differentials a suitable interpretation. We shall suppose 








BROWNIAN MOVEMENT AND STOCHASTIC EQUATIONS 359 


that the B(t)-process is a differential process: that is, if $t_1 < \cdots < t_n$, we
suppose that


$B(t_2) - B(t_1), \; \cdots, \; B(t_n) - B(t_{n-1})$


are mutually independent chance variables. We also suppose temporal homo- 
geneity, that is that the distribution of B(s + t) — B(s) is independent of s. 
The physical meaning of these hypotheses is clear, and they will be justified 
further below. Equation (3.3) can be interpreted roughly in terms of small 
changes in momentum. An important particular case is that in which the 
second moments of the B(t)-process are finite: 


(3.4)    $\sigma^2(t) = E\{[B(s + t) - B(s)]^2\} < \infty.$


The first moment E{B(s + t) − B(s)} then exists. If this first moment
vanishes, $\sigma^2(t)$ satisfies the functional equation


$\sigma^2(s + t) = \sigma^2(s) + \sigma^2(t).$


Then $\sigma^2(t)$ must be proportional to t: $\sigma^2(t) = t\sigma^2$. If f(t) is continuous,


(3.5)    $\int_a^b f(t)\,dB(t)$


has been defined under these hypotheses (Wiener (18), pp. 151-157, Doob (3), 
pp. 131-134), even though the functions B(t) are known not to be of bounded 
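variation. The definition makes all the formal processes correct. For example,
if f′(t) exists and is continuous,


(3.6)    $\int_a^b f(t)\,dB(t) = f(t)[B(t) - B(0)]\Big|_a^b - \int_a^b [B(t) - B(0)]\,f'(t)\,dt$


with probability 1. The usual Riemann-Stieltjes sums converge to (3.5) in the
mean. Moreover


(3.7)    $E\Big\{\int_a^b f(t)\,dB(t)\Big\} = 0, \qquad E\Big\{\Big[\int_a^b f(t)\,dB(t)\Big]\Big[\int_a^b g(t)\,dB(t)\Big]\Big\} = \sigma^2 \int_a^b f(t)\,g(t)\,dt.$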
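
The second relation in (3.7), taken with g = f, can be checked at the level of the approximating sums. The sketch below is only an illustration under the finite second moment hypothesis (3.4): it computes the variance of a Riemann-Stieltjes sum exactly (no sampling) and compares it with $\sigma^2 \int f^2\,dt$. The helper name `rs_sum_variance` and the choice $f(t) = e^{\beta t}$ are assumptions made for the example.

```python
import math

def rs_sum_variance(f, a, b, n, sigma2):
    """Variance of sum_k f(t_k) * [B(t_{k+1}) - B(t_k)] on a uniform grid.

    The increments are independent with mean 0 and variance sigma2 * dt,
    so the variance of the sum is sigma2 * sum_k f(t_k)**2 * dt.
    """
    dt = (b - a) / n
    return sigma2 * sum(f(a + k * dt) ** 2 for k in range(n)) * dt

# Example integrand f(t) = exp(beta * t) on [0, t_end] (illustrative values).
beta, sigma2, t_end = 1.5, 2.0, 1.0
approx = rs_sum_variance(lambda s: math.exp(beta * s), 0.0, t_end, 100_000, sigma2)
# sigma^2 * integral_0^t_end exp(2 * beta * s) ds, evaluated in closed form
exact = sigma2 * (math.exp(2 * beta * t_end) - 1.0) / (2 * beta)
```

As the grid is refined, `approx` approaches `exact`, in line with the convergence in the mean of the sums stated above.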


Now it can be shown even without the hypothesis of the finiteness of the second 
moment in (3.4) that the formal integral in (3.5) can be defined, and will satisfy 
(3.6). The form of the characteristic function of B(s + t) — B(s) has been 
derived by Lévy ((14) Chapter VII) and using this it is easy to prove that the 





* We never write B(t) alone in an equation, since strictly speaking only differences like
B(t) − B(0) are defined. It is unnecessary to define B(0) itself, although for convenience
it can be taken identically 0, without affecting any of the equations to be used. Differential
processes have been discussed in detail by Lévy ((12), (13), (14) Chapter VII) and Doob
(3).







usual Riemann-Stieltjes sums for the integral (3.5) converge in probability.
The integral is defined as the limit, and (3.6) is readily verified. On the other 
hand, (3.7) cannot be expected to hold, since if f(t) = 1 the integral becomes 
B(b) — B(a), and we have not supposed that the expectation of this difference 
is finite. The special case in which the second moment is finite is the only 
important one for the purposes of this section, but less restrictive conditions will 
be needed in §5. We shall justify later the assumption that the B(t) process 
is a differential process. 

We shall interpret an equation in differentials like (3.3) to mean the truth
(with probability 1, that is for almost all functions u(t)) of


(3.8)    $\int_a^b f(t)\,du(t) = -\beta \int_a^b f(t)\,u(t)\,dt + \int_a^b f(t)\,dB(t)$


for all a, b, whenever f is a continuous function. Here the first two integrals
are to be defined as the limits (in probability) of the usual Riemann or
Riemann-Stieltjes sums. Equation (2.2.7) implies


(3.9)    $\int_a^b f(t)\,du(t) = -\beta \int_a^b f(t)\,dx(t) + \int_a^b f(t)\,dB(t) = -\beta \int_a^b f(t)\,u(t)\,dt + \int_a^b f(t)\,dB(t).$


Thus (3.3) holds for the u(t) of the O. U. distribution if the B(t) is defined by 
(2.2.7). Moreover (2.2.7) with B(t) replaced by B(t) − B(0) is an immediate
consequence of (3.3). In this case, B(t) has the property that the differences 
B(s + t) — B(s) have finite second moments and even Gaussian distributions, 
but we are not making either assumption in solving (3.3). 

If (3.3) is true, then (with probability 1)


(3.10)    $\int_0^t e^{\beta s}\,du(s) = -\beta \int_0^t e^{\beta s} u(s)\,ds + \int_0^t e^{\beta\tau}\,dB(\tau),$

which implies, since integration by parts is applicable,

(3.11)    $u(t) = u(0)e^{-\beta t} + e^{-\beta t} \int_0^t e^{\beta\tau}\,dB(\tau)$

for all t, with probability 1. Conversely suppose that u(t) is defined by (3.11).
Since B(t) is known to be continuous in t except for non-oscillatory
discontinuities (jumps) (Lévy (12) pp. 359-364, (13); Doob (3), pp. 134-138), the
same must be true of the right side of (3.11), and therefore of u(t). Then u(t)
is Riemann integrable with probability 1. Moreover


(3.12)    $\int_a^b f(t)\,e^{-\beta t}\,d_t \int_0^t e^{\beta\tau}\,dB(\tau) = \int_a^b f(t)\,dB(t),$


so that from (3.11) 










(3.13)    $\int_a^b f(t)\,e^{-\beta t}\,d_t[e^{\beta t}u(t) - u(0)] = \int_a^b f(t)\,dB(t),$

proving incidentally that the left side exists. The left side can be simplified to


(3.14)    $\beta \int_a^b f(t)\,u(t)\,dt + \int_a^b f(t)\,du(t)$


and putting this into (3.13) we find that (3.8) is satisfied. Then (3.11) furnishes
the complete solution of (3.3) under the stated conditions. We stress again
that although (3.11) implies strong connections between the u(t) and B(t)
processes, we have made no such hypothesis in the derivation not implicit in (3.3).
Lévy ((14) pp. 166-167) has shown that the only differential processes whose
path functions B(t) − B(0) do not have jumps have the property that the
distribution of B(t) − B(0) is Gaussian. Then it is only in this case, which will
lead to the O. U. process, that u(t) will not have jumps.
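
The step from (3.10) to (3.11) can be written out as follows. This is only a sketch of the integration by parts that the text invokes, in the notation of (3.10) and (3.11).

```latex
\begin{aligned}
\int_0^t e^{\beta s}\,du(s)
  &= e^{\beta t}u(t) - u(0) - \beta\int_0^t e^{\beta s}u(s)\,ds
  && \text{(integration by parts)} \\
e^{\beta t}u(t) - u(0)
  &= \int_0^t e^{\beta\tau}\,dB(\tau)
  && \text{(insert in (3.10); the terms in } \beta \text{ cancel)} \\
u(t) &= u(0)\,e^{-\beta t} + e^{-\beta t}\int_0^t e^{\beta\tau}\,dB(\tau).
\end{aligned}
```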

The term $\beta u(t)$ in the Langevin equation is supposed to account for the total
frictional effect, including the Doppler friction, caused by the fact that more
impacts decelerate than accelerate the motion of a moving particle. The term
A(t) in (3.1) or dB(t) in (3.3) represents the “purely random” impulses, that is,
the residual effect after the frictional effect has been subtracted out. One idea
running through any treatment of the Langevin equation is that this term or,
sometimes, x(t) itself, is independent of the given velocity at any time. This
hypothesis goes back to Langevin, and has caused much controversy. We shall
make the hypothesis only to the following extent. The chance variable u(0)
will be given various initial distributions, but will always be made independent
of the B(t)-process for t ≥ 0. This means that if $0 < t_1 < \cdots < t_n$ the chance
variable u(0) is supposed independent of the set of chance variables


$\{B(t_{j+1}) - B(t_j), \; j = 1, \cdots, n - 1\}.$


We shall describe the above hypothesis in the following physical terms: the
initial velocity u(0) is independent of later residual random impacts. It would
be a serious drawback to the whole treatment if when u(0) is so chosen $u(t_0)$
for each $t_0 > 0$ were not independent of the B(t)-process for $t \ge t_0$, that is if
$u(t_0)$ were not independent of later residual random impacts for all $t_0$. We
can prove, however, the following statement, which incidentally justifies our
hypothesis that the B(t)-process is a differential process. Let the B(t)-process
be a differential process, and define u(t) by (3.11). If the chance variable u(0) is
independent of the B(t)-process for t ≥ 0, then $u(t_0)$ will be independent of the
B(t)-process for $t \ge t_0$, for all $t_0 > 0$. Conversely suppose only that the B(t)-process
is regular enough that the integral (3.5) can be defined as the limit in probability
of the usual sums, and that (3.6) is true. Then if u(t) is defined by (3.11), and
if choosing u(0) independent of the B(t)-process for t ≥ 0 implies that $u(t_0)$ will be
independent of the B(t)-process for $t \ge t_0$, for all $t_0 > 0$, then the B(t)-process is a
differential process.










Proof. Let the B(t)-process be a differential process, define u(t) by (3.11)
and let u(0) be independent of the B(t)-process for t ≥ 0. Then from (3.11)
with $t = t_0$, $u(t_0)$ involves only u(0) and the B(t)-process for $t \le t_0$. Then
$u(t_0)$ is independent of the B(t)-process for $t \ge t_0$ because the B(t)-process is a
differential one, with differences involving t-values beyond $t_0$ independent of
those involving t-values before $t_0$. Conversely suppose that choosing u(0)
independent of the B(t)-process for t ≥ 0 implies that $u(t_0)$ will be independent of
the B(t)-process for $t \ge t_0$, for all $t_0 > 0$. Then if u(0) is so chosen,


$u(0) + \int_0^{t_0} e^{\beta\tau}\,dB(\tau)$

and therefore

$\int_0^{t_0} e^{\beta\tau}\,dB(\tau)$


are independent of the B(t)-process for $t \ge t_0$. This fact implies that the
preceding integral determines a differential process, that is, if $t_1 < \cdots < t_n$, the
integrals


$\int_{t_j}^{t_{j+1}} e^{\beta\tau}\,dB(\tau)$


are mutually independent. Then (applying this fact to subintervals of the
intervals $(t_j, t_{j+1})$)


$\int_{t_j}^{t_{j+1}} e^{-\beta t}\,d_t \int_{t_j}^{t} e^{\beta\tau}\,dB(\tau), \qquad j = 1, \cdots, n - 1,$

are mutually independent, and these repeated integrals are simply

$B(t_{j+1}) - B(t_j), \qquad j = 1, \cdots, n - 1.$


The latter differences are therefore mutually independent, as was to be proved. 

We shall need the following lemma. 

Lemma 3. Suppose that a < 1, and let $x_1, x_2, \cdots$ be mutually independent
chance variables with a common distribution function. If there is a chance variable
y with a Gaussian distribution such that the distribution function of $\sum_{j=1}^{n} a^j x_j$
approaches that of y as $n \to \infty$, then the $x_j$ have Gaussian distributions.

Many of the hypotheses of the lemma are unnecessary, but its statement is
general enough for our purposes, and the proof will apply to a situation to be
discussed in §5, where the distribution of y will not be Gaussian. The
hypotheses imply that the distribution of $\sum_{j=2}^{n} a^j x_j$ approaches that of $ay$ as $n \to \infty$.
If $\varphi(t)$ is the characteristic function of $x_1$, and $\psi(t)$ that of y, writing $\sum_{j=1}^{n} a^j x_j$
in the form $a x_1 + \sum_{j=2}^{n} a^j x_j$ shows that


$\psi(t) = \varphi(at)\cdot\psi(at).$


Solving for $\varphi$ we find that it is the characteristic function of a Gaussian
distribution, as was to be proved.
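
The phrase "solving for the characteristic function" can be made explicit in the Gaussian case. The following is a sketch under two assumptions not spelled out in the lemma: that y has mean 0, with variance denoted here by v (a symbol introduced only for this illustration), and that 0 < a < 1 so that division by a is allowed.

```latex
\begin{aligned}
\psi(t) &= e^{-v t^{2}/2}, \\
\varphi(at) &= \frac{\psi(t)}{\psi(at)} = e^{-v(1 - a^{2})t^{2}/2}, \\
\varphi(t) &= \exp\!\Big(-\frac{v(1 - a^{2})}{2a^{2}}\,t^{2}\Big),
\end{aligned}
```

so that each $x_j$ is Gaussian with mean 0 and variance $v(1 - a^2)/a^2$. As a consistency check, $\sum_{j \ge 1} a^{2j}\,v(1 - a^2)/a^2 = v$.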













In the physical picture under discussion, further conditions on the solution of 
(3.3) are known. In fact the Brownian movement is simply a visible example 
of molecular or near molecular movement. The general principles of such move- 
ments are therefore applicable, and the principle of equipartition of energy leads 
to the Maxwell distribution of velocities. Let k be the Boltzmann constant, 
and T the absolute temperature. We can formulate the significance of the
Maxwell distribution (as much as we shall need it) as follows. 

M₁. Tendency towards the Maxwell distribution. Whatever the initial
distribution of u(0), the transition probabilities have the property that when t → ∞
the distribution function of u(t) converges to the Gaussian distribution function
with mean 0 and variance kT/m. (Here m is the mass of the moving particle.)

M₂. Stability of the Maxwell distribution. If u(0) is independent of later
residual random impacts, and if it has the Gaussian distribution described in
M₁, u(t) will have this same distribution for every positive t.

These two statements are closely related, but neither apparently can be de- 
duced from the other without further assumptions. Since these principles act 
the part of a deus ex machina in a discussion of the Langevin equation, we shall 
use them as little as possible. It will usually be sufficient to use a weakened 
form of M₁:

M₁′. There is an initial distribution of u(0), such that the transition
probabilities have the property that when t → ∞ the distribution function of u(t)
converges to the Gaussian distribution function with mean 0 and variance kT/m.
It is understood here as before that u(0) is to be independent of later residual
random impacts.

Conditions M₁′ and M₂ restrict the possibilities for the B(t)-process. In fact
suppose that condition M₁′ is satisfied. Then (3.11) shows that


$e^{-\beta t} \int_0^t e^{\beta\tau}\,dB(\tau)$


is nearly Gaussian for large t, with mean 0 and variance kT/m. We write this
integral as a sum, replacing t by nt:





(3.15)    $e^{-\beta n t} \int_0^{nt} e^{\beta\tau}\,dB(\tau) = \sum_{j=0}^{n-1} e^{-\beta(n-j)t}\,x_j,$

where

(3.16)    $x_j = \int_{jt}^{(j+1)t} e^{\beta(\tau - jt)}\,dB(\tau).$

Since the B(t)-process is a differential process, and is temporally homogeneous,
the $x_j$ are mutually independent, with identical distributions. According to
the lemma, the right side of (3.15) cannot become Gaussian for large n unless
the distribution of $x_j$ is Gaussian. Thus, since t is arbitrary in the above
discussion,


$\int_s^t e^{\beta\tau}\,dB(\tau)$










has a Gaussian distribution for all s, t. Since the chance variables


(3.17)    $\int_{jt/n}^{(j+1)t/n} e^{\beta\tau}\,dB(\tau), \qquad j = 0, \cdots, n - 1,$

are mutually independent and Gaussian, the chance variable

(3.18)    $\sum_{j=0}^{n-1} e^{-\beta jt/n} \int_{jt/n}^{(j+1)t/n} e^{\beta\tau}\,dB(\tau)$


also has a Gaussian distribution. When n becomes infinite, (3.18) becomes
B(t) − B(0), with probability 1. The latter difference thus has a Gaussian
distribution, with mean 0. The B(t)-process therefore has finite second
moments $\sigma^2(t) = t\sigma^2$ as defined in (3.4). According to (3.7) the last term in (3.11),
which we now know has a Gaussian distribution, has mean 0 and variance
$\sigma^2(1 - e^{-2\beta t})/2\beta$. Then $u(t) - e^{-\beta t}u(0)$ has this same distribution. The
variance becomes $\sigma^2/2\beta$ when $t \to \infty$, and therefore, according to M₁′, $\sigma^2 = 2\beta kT/m$.
Thus condition M₁′ completely determines the B(t)-process. We show next that
condition M₂ determines this same B(t)-process. In fact suppose condition M₂
is true, and assign to u(0) the distribution of that condition. Then u(0) is
independent of the integral in (3.11), and in (3.11), u(t) (which has a Gaussian
distribution, according to condition M₂) is expressed as the sum of two
independent chance variables, of which the first is Gaussian. The characteristic
function of the second is the quotient of the characteristic functions of two
Gaussian distributions, and is therefore the characteristic function of a Gaussian
distribution. Thus the expression


(3.19)    $e^{-\beta t} \int_0^t e^{\beta\tau}\,dB(\tau)$


has a Gaussian distribution for all t, and this implies, as above, that B(t) − B(0)
has a Gaussian distribution, with variance $\sigma^2 t$. The variances on the right side
of (3.11) add up to that on the left, giving an equation for $\sigma^2$:

(3.20)    $\dfrac{kT}{m} = \dfrac{kT}{m}\,e^{-2\beta t} + \dfrac{\sigma^2}{2\beta}\,(1 - e^{-2\beta t}).$

Then $\sigma^2 = 2\beta kT/m$ as above.
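
For completeness, the solution of (3.20) for the variance parameter is a one-line computation. The sketch below only rearranges (3.20), using the variance $\sigma^2(1 - e^{-2\beta t})/2\beta$ of the integral term obtained from (3.7).

```latex
\begin{aligned}
\frac{kT}{m}\,\bigl(1 - e^{-2\beta t}\bigr) &= \frac{\sigma^{2}}{2\beta}\,\bigl(1 - e^{-2\beta t}\bigr)
\quad (t > 0), \\
\sigma^{2} &= \frac{2\beta kT}{m}.
\end{aligned}
```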

We can now finally derive the O. U. velocity process as the solution of the 
Langevin equation. Suppose the B(t)-process is the one derived in the preceding 
paragraphs, and choose the chance variable u(0) to be independent of the
B(t)-process for t ≥ 0. Then u(0) is independent of the integral in (3.11), and
this means that the conditional distribution of u(t) for u(0) = u₀ is Gaussian,
with mean $u_0 e^{-\beta t}$ and variance $kT(1 - e^{-2\beta t})/m$. Moreover, (3.11) implies


(3.21)    $u(s + t) = u(s)e^{-\beta t} + e^{-\beta(s+t)} \int_s^{s+t} e^{\beta\tau}\,dB(\tau).$


As we have seen, u(s) is independent of the B(t)-process as far as it appears 
in (3.21) and therefore is independent of the integral. Thus the transition 







probabilities from s to s + t are the same as those from 0 to t, which are
precisely those of the O. U. process. Incidentally it follows that the full condition
M₁ is satisfied. Finally, if u(0) is not only supposed independent of the
B(t)-process, for t ≥ 0, but also is supposed to have a Gaussian distribution with
mean 0 and variance kT/m, the same will be true of u(t) (as can be calculated
from (3.11)) and condition M₂ is thus satisfied. We can summarize all our
results as follows.

THEOREM 3. Let the B(t)-process be a temporally homogeneous differential
process. Then (3.11) furnishes the solution of (3.3). The following conditions on
the solution are equivalent.

(i) The solution satisfies condition M₁′.
(ii) The solution satisfies condition M₁.
(iii) The solution satisfies condition M₂.
(iv) B(t) − B(0) has a Gaussian distribution, with mean 0 and variance
$\sigma^2 t = (2\beta kT/m)\,t$.

If the above conditions are satisfied, $u(t) - e^{-\beta t}u(0)$ will have a Gaussian
distribution with mean 0 and variance $kT(1 - e^{-2\beta t})/m$; if u(0) is independent of the
B(t)-process for t ≥ 0, u(s) is independent of the B(t)-process for t ≥ s for all
s > 0, and the transition probabilities of the u(t)-process are those of the O. U.
velocity process. If in addition u(0) has the Gaussian distribution with mean 0
and variance kT/m, the u(t)-process becomes the O. U. process, with m = 0,
$\sigma_0^2 = kT/m$.
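
The transition law stated in Theorem 3 can be checked for internal consistency with a few lines of code. The sketch below is an illustration only: it encodes the conditional mean $u_0 e^{-\beta t}$ and variance $kT(1 - e^{-2\beta t})/m$, then verifies that two successive transitions combined through (3.21) reproduce a single transition over the whole interval. The function names `ou_transition` and `compose` are hypothetical.

```python
import math

def ou_transition(u0, t, beta, kT_over_m):
    """Conditional mean and variance of u(t) given u(0) = u0."""
    mean = u0 * math.exp(-beta * t)
    var = kT_over_m * (1.0 - math.exp(-2.0 * beta * t))
    return mean, var

def compose(u0, s, t, beta, kT_over_m):
    """Moments of u(s + t) given u(0) = u0, built in two steps via (3.21).

    u(s + t) = u(s) * exp(-beta * t) + (independent Gaussian term), so the
    mean is scaled by the decay factor and the variances add.
    """
    mean_s, var_s = ou_transition(u0, s, beta, kT_over_m)
    decay = math.exp(-beta * t)
    _, var_t = ou_transition(0.0, t, beta, kT_over_m)
    return mean_s * decay, var_s * decay ** 2 + var_t
```

The same variance bookkeeping gives condition M₂: if u(0) has variance kT/m, then u(t) has variance $(kT/m)e^{-2\beta t} + (kT/m)(1 - e^{-2\beta t}) = kT/m$.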

The Langevin equation gives a physical interpretation to every property of 
the O. U. process. It is interesting to verify that as h → 0 the correlation
coefficient of the pair B(s + h) − B(s), u(t) (any s, t) goes to 0. In this sense
then, dB(s), the effect of the residual random impacts at time s, is independent
of the velocity at any particular time t. Since in (3.11) u(t) is written in terms
of the B(t)-process, u(t) is of course not independent of this process. 

We have written u(t) in terms of the B(t)-process. It is easy to write x(t)
in terms of the B(t)-process by combining (2.1) with (3.11):


(3.22)    $x(t) = x(0) + \dfrac{1 - e^{-\beta t}}{\beta}\,u(0) + \dfrac{1}{\beta} \int_0^t [1 - e^{-\beta(t-\tau)}]\,dB(\tau).$

Instead of finding the distributions of the displacement and velocity processes,
and their correlations, as at the beginning of the paper, we could easily derive
the desired results using (3.11) and (3.22). The various expectations can be
calculated using (3.7).
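
The computation behind (3.22) can be sketched as follows. It assumes that (2.1) is the relation $x(t) = x(0) + \int_0^t u(s)\,ds$ between displacement and velocity (that equation lies outside this section), and that the order of the two integrations may be interchanged, which can be justified by the integration by parts formula (3.6).

```latex
\begin{aligned}
x(t) - x(0) &= \int_0^t u(s)\,ds
  = u(0)\int_0^t e^{-\beta s}\,ds
    + \int_0^t e^{-\beta s}\int_0^s e^{\beta\tau}\,dB(\tau)\,ds \\
 &= \frac{1 - e^{-\beta t}}{\beta}\,u(0)
    + \int_0^t e^{\beta\tau}\Big[\int_\tau^t e^{-\beta s}\,ds\Big]\,dB(\tau) \\
 &= \frac{1 - e^{-\beta t}}{\beta}\,u(0)
    + \frac{1}{\beta}\int_0^t \bigl[1 - e^{-\beta(t-\tau)}\bigr]\,dB(\tau).
\end{aligned}
```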

In physical applications, the correlation function E{u(s)u(s + t)} is sometimes
wanted as a time average. Now the transformation $S_h$ taking B(t) − B(0) into
B(t + h) − B(h) preserves the B(t) probability relations (temporal
homogeneity), and the family of transformations $\{S_h\}$ is well known to be metrically
transitive.* Then applying the ergodic theorem to the function u(0)u(h),
considered as a function of the B(t), we find that


* Cf. for example Doob, (3) p. 125.










(3.23)    $\lim_{t\to\infty} \dfrac{1}{t} \int_0^t u(s)\,u(s + h)\,ds = E\{u(0)u(h)\} = \sigma_0^2\,e^{-\beta |h|},$

with probability 1, that is for almost all functions u(t). The ergodic theorem
was applied to the B(t)-process in essentially this way by Wiener ((18) p. 169)
who has been interested in the harmonic analysis of functions like the u(t)
discussed here. The work of this paper verifies in this particular case the
importance Wiener gave to the functions of the B(t)-process of the type (3.5).
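
The time average in (3.23) can be illustrated numerically. The sketch below is not part of the paper's argument: it samples the stationary O. U. velocity on a grid using the exact Gaussian transition (mean $u e^{-\beta\,dt}$, variance $(kT/m)(1 - e^{-2\beta\,dt})$) and averages $u(s)u(s + h)$ along one long path. The function name, the grid step and the seed are arbitrary choices for the example.

```python
import math
import random

def time_average_correlation(beta, kT_over_m, h, dt, n_steps, seed=0):
    """Average of u(s) * u(s + h) along one simulated stationary path."""
    rng = random.Random(seed)
    decay = math.exp(-beta * dt)
    step_sd = math.sqrt(kT_over_m * (1.0 - decay ** 2))
    lag = int(round(h / dt))
    # Start from the Maxwell distribution so that the path is stationary.
    u = rng.gauss(0.0, math.sqrt(kT_over_m))
    path = [u]
    for _ in range(n_steps + lag):
        # Exact one-step transition of the O. U. velocity process.
        u = u * decay + step_sd * rng.gauss(0.0, 1.0)
        path.append(u)
    total = sum(path[k] * path[k + lag] for k in range(n_steps))
    return total / n_steps
```

For a long path the returned value should lie close to $(kT/m)e^{-\beta h}$, with a sampling error that shrinks roughly like the inverse square root of the total time simulated.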

There is no difficulty in extending the above results to bound particles. For
example, the Langevin equation of the harmonically bound particle is


(3.24)    $\dfrac{du}{dt} = -\beta u - \omega^2 x + A(t),$

which in our treatment becomes

(3.25)    $du = -\beta u\,dt - \omega^2 x\,dt + dB.$


The usual methods of solving the differential equation (3.24) are still applicable
to (3.25) and again the distribution of u turns out to be Gaussian.¹⁰ The
distribution of displacements is then obtained as above.


4. The B(t)-impact process 


When the B(t)-process and the initial conditions on u(0) are given, the
solution u(t) is determined by (3.11). Conversely if the solution u(t) is known,
B(t) is determined by the equation


(4.1)    $B(t) - B(0) = \beta \int_0^t u(s)\,ds + u(t) - u(0)$


which is derived immediately from (3.3). The O. U. velocity distribution for
the u(t)-process can therefore be given only by the B(t)-process described in §3.
We shall investigate the possibility that a different choice of the B(t)-process
might have led to a different velocity process compatible with the known
physical conditions like M₁ and M₂. If we suppose that u(0) can be chosen so
that the velocity at each moment is independent of subsequent residual random
impacts, then we have seen that the B(t)-process must be differential, and is
then uniquely determined by conditions M₁ or M₂. Any velocity process other
than the O. U. process satisfying the Langevin equation and M₁ or M₂ would
therefore imply dependence between velocity and later residual impacts. This
is really another way of saying that the frictional resistance cannot be
considered as proportional to the velocity. Before going further we put a
condition going back to Maxwell in its modern setting. We formulate a hypothesis
M₃ as follows.

M₃. In two or more dimensions (using any orthogonal axes) the velocity
components are mutually independent.
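
Returning to (4.1), the statement that B(t) is determined by u(t) has a simple discrete counterpart. The sketch below is an illustration under an assumption of its own: it uses an Euler discretization of (3.3), for which the recovery of the increments of B from the velocity path is exact. The function names `euler_velocity` and `recover_increments` are hypothetical.

```python
def euler_velocity(u0, beta, dt, dB):
    """Euler form of (3.3): u_{k+1} = u_k - beta * u_k * dt + dB_k."""
    u = [u0]
    for inc in dB:
        u.append(u[-1] - beta * u[-1] * dt + inc)
    return u

def recover_increments(u, beta, dt):
    """Discrete form of (4.1): dB_k = beta * u_k * dt + u_{k+1} - u_k."""
    return [beta * u[k] * dt + u[k + 1] - u[k] for k in range(len(u) - 1)]

# Round trip on an arbitrary list of increments (illustrative values).
increments = [0.1, -0.2, 0.05, 0.3]
velocity = euler_velocity(1.0, 2.0, 0.01, increments)
recovered = recover_increments(velocity, 2.0, 0.01)
```

Here `recovered` agrees with `increments` up to rounding, which mirrors the remark that a given velocity process can arise from only one B(t)-process.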





10 Cf. Ornstein and Wijk (16) and Wijk (17). The B(t, A) used in these papers corre- 
sponds formally to our dB(t). The difference is that it is possible to give a precise descrip- 
tion of the B(t)-distribution. 







In conjunction with the following lemma, due to Kac ((8) p. 278), hypothesis
M₃ implies that all quantities linear in the displacement or velocity functions
have Gaussian distributions.

Lemma.¹¹ Let $(x_1, y_1), \cdots, (x_n, y_n)$ be 2n chance variables with the property
that the sets of chance variables


$\{x_j\cos\theta + y_j\sin\theta, \; j = 1, \cdots, n\}, \quad \{-x_j\sin\theta + y_j\cos\theta, \; j = 1, \cdots, n\}$


are mutually independent for each value of θ. Then $(x_1, \cdots, x_n)$ have an n-variate
Gaussian distribution or a singular Gaussian distribution.

We can combine the Maxwell hypotheses to obtain another justification of 
the O. U. velocity process. 

THEOREM 4. Let the B(t)-process be any process such that the distribution of
$B(t_2) - B(t_1)$ or of any quantity depending on such differences is unaffected by
translations of the t-axis, and that the integral (3.5) can be defined as the limit in
probability of the usual sums, with (3.6) valid. Then if u(t) is defined by (3.11),
and if conditions M₂ and M₃ are satisfied, the B(t)-process must be precisely that
finally obtained in §3, leading to the O. U. velocity process.

Suppose that condition M₂ is satisfied, and let u(0) be fixed as in that
condition. Just as in §3, (3.11) then implies that the integral


$B^*(t) = \int_0^t e^{\beta\tau}\,dB(\tau)$


has a Gaussian distribution with mean 0 and variance $(kT/m)(e^{2\beta t} - 1)$. If
condition M₃ is also satisfied, $B^*(t_2) - B^*(t_1)$, and more generally any finite set
of such differences, has a one or more dimensional Gaussian distribution. Using
the fact that the distribution of $e^{-\beta s}[B^*(s + t) - B^*(s)]$ is the same as that of
$B^*(t)$, in evaluating the expectations in the following equation


(4.2)    $E\{B^*(s + t)^2\} = E\{[B^*(s) + [B^*(s + t) - B^*(s)]]^2\},$


we find that $B^*(s) = B^*(s) - B^*(0)$ and $B^*(s + t) - B^*(s)$ are uncorrelated.
These two variables are therefore independent. Going further, similar
calculations show that any differences $B^*(t_2) - B^*(t_1)$, $B^*(s_2) - B^*(s_1)$ with
$0 \le s_1 < s_2 \le t_1 < t_2$ are independent. Using the fact (derived from condition M₃) that
any finite set of differences has a multivariate Gaussian distribution, the
B*(t)-process is thus a differential process. This means, by a method we have used
above, that the B(t)-process is a differential process, leading to the O. U. velocity
distribution, because condition M₂ is satisfied.
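
The evaluation of (4.2) mentioned in the proof can be sketched as follows. The abbreviation $V(t) = (kT/m)(e^{2\beta t} - 1)$ for the variance of $B^*(t)$ is introduced here only for convenience.

```latex
\begin{aligned}
V(s + t) &= E\{B^*(s)^2\} + 2\,E\{B^*(s)\,[B^*(s+t) - B^*(s)]\}
            + E\{[B^*(s+t) - B^*(s)]^2\} \\
         &= V(s) + 2\,E\{B^*(s)\,[B^*(s+t) - B^*(s)]\} + e^{2\beta s}\,V(t), \\
V(s) + e^{2\beta s}\,V(t)
         &= \frac{kT}{m}\bigl(e^{2\beta s} - 1\bigr)
            + \frac{kT}{m}\,e^{2\beta s}\bigl(e^{2\beta t} - 1\bigr)
          = V(s + t),
\end{aligned}
```

so the cross term $E\{B^*(s)[B^*(s+t) - B^*(s)]\}$ must vanish, which is the uncorrelatedness asserted in the proof.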

It is easily seen from counterexamples that Theorem 4 is no longer correct if
condition M₁ is supposed instead of condition M₂.


5. Velocity processes not subject to Maxwell’s laws 


In all the above work the role of the Maxwell velocity distribution has been 
fundamental. In certain studies, however, other distributions play a somewhat 


¹¹ The result is stated slightly incorrectly by Kac.







analogous role.” It is interesting to note that the Langevin equation can be 
solved to give a distribution whose transition probabilities are asymptotically 
any of the symmetric stable distributions classified by Lévy ((14) §30, §56, §57). 
Such a distribution has characteristic function 


22/7 
e viel 


where a; is a positive parameter and 0 < y S 2. The Gaussian distribution is 
obtained when y = 2. The parameter o plays the role of the variance, although 
the second moment is never finite when y < 2. The velocity process we shall 
derive will be called the O. U. (y) process. It is the O. U. process when y = 2. 
The O. U. (y) process can be described as follows. 

1. The process is temporally homogeneous, that is trz .lations of the t-axis 
do not affect the probability distributions. 

2. The process is a Markoff process. 

3. For each fixed t, u(t) has a symmetric stable distribution with parameter 
value 0) , exponent y. The conditional distribution of u(s + #) for u(s) = w 
is the stable distribution symmetric about we’, with parameter value 
o3(1 — e 7*'*') and exponent y. 

We can obtain this process as a solution of the Langevin equation by choosing
the B(t)-process properly. In fact, let the B(t)-process be the temporally
homogeneous differential process in which B(s + t) − B(s) has a symmetric
stable distribution with exponent $\gamma$ and parameter value $\sigma^2 t$. Let u(t) be the
corresponding solution of the Langevin equation, given by (3.11). If y is the
sum of two independent chance variables with stable symmetric distributions,
having parameter values $\sigma_1^2$, $\sigma_2^2$, and with the same exponent $\gamma$, then y also has a
symmetric stable distribution, with the same exponent, $\gamma$, and with parameter
value $\sigma_1^2 + \sigma_2^2$. From this fact it is simple to check that the integral (3.5) in
the present case has a symmetric stable distribution with exponent $\gamma$ and
parameter value

$$\sigma^2 \int_a^b |f(t)|^{\gamma}\, dt.$$

If u(0) is given a symmetric stable distribution independent of the B(t)-process
for $t \ge 0$, with parameter value $\sigma^2/\gamma\beta$, the distribution of u(t) can be calculated,
using characteristic functions, and is found to be symmetric and stable, with
exponent $\gamma$ and parameter value $\sigma^2/\gamma\beta$. The u(t) thus defined determines an
O. U. ($\gamma$) process, with the above three properties, setting $\sigma_0^2 = \sigma^2/\gamma\beta$.
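
The choice $\sigma_0^2 = \sigma^2/\gamma\beta$ can be checked numerically from the two rules used above: scaling a symmetric stable variable by c multiplies its parameter value by $|c|^\gamma$, and parameter values of independent summands add. The Python sketch below is an illustration only. It assumes that (3.11), which is not reproduced in this section, has the usual form $u(t) = u(0)e^{-\beta t} + \int_0^t e^{-\beta(t-s)}\,dB(s)$; the function names are hypothetical.

```python
import math


def stationary_parameter(sigma2, beta, gamma):
    """Candidate stationary parameter value sigma_0^2 = sigma^2 / (gamma * beta)."""
    return sigma2 / (gamma * beta)


def parameter_of_u(t, sigma2, beta, gamma, steps=20000):
    """Parameter value of u(t) when u(0) has parameter sigma^2 / (gamma * beta).

    Assumes u(t) = u(0) e^{-beta t} + int_0^t e^{-beta (t - s)} dB(s).
    """
    sigma0_2 = stationary_parameter(sigma2, beta, gamma)
    # Contribution of the initial value, scaled by e^{-beta t}.
    from_initial = sigma0_2 * math.exp(-beta * t) ** gamma
    # Midpoint rule for sigma^2 * int_0^t |e^{-beta (t - s)}|^gamma ds.
    h = t / steps
    from_noise = sigma2 * h * sum(
        math.exp(-beta * (t - (i + 0.5) * h)) ** gamma for i in range(steps)
    )
    return from_initial + from_noise


if __name__ == "__main__":
    sigma2, beta, gamma = 2.0, 0.8, 1.5
    print(stationary_parameter(sigma2, beta, gamma))
    print(parameter_of_u(1.7, sigma2, beta, gamma))
```

Under these assumptions the two printed values agree up to the quadrature error, whatever t is chosen, which is the stationarity expressed by property 3.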

We shall not spend any time on the details of the analysis of the O. U. ($\gamma$)
process, since the work runs parallel to that for the case $\gamma = 2$, already discussed.
There are, however, a few essential differences. If v(t) is determined by the
equation

(1.2.1)    $v(t) = t^{1/\gamma}\, u\!\left(\frac{1}{\gamma\beta} \log t\right), \qquad t > 0,$





12 Cf. Holtzmark (7). 










the v(t) process can be analyzed using (3.11). The v(t)-process has the same
distribution as the B(t)-process just described. The continuity properties of the
velocity process can now be derived from those of the v(t)-process, which are
known. When $\gamma < 2$, the velocity function u(t) is no longer a continuous function
of t with probability 1, but is certain to have discontinuities. These discontinuities
are however non-oscillatory (jumps). We omit the details of the
analogue of Theorem 1.3. Theorem 2.1 is still true if $\gamma \ge 1$. The considerations
of §3 have their obvious counterparts here. Lemma 3 played an essential
role, but its statement and proof are correct if the variable y of the lemma is
supposed to have a symmetric stable distribution and if the conclusion is that
the $x_j$ have a symmetric stable distribution with the same exponent as y.


UNIVERSITY OF ILLINOIS 
AND 
INSTITUTE FOR ADVANCED STUDY


BIBLIOGRAPHY 
1. S. BERNSTEIN, Comptes Rendus de l'Académie des Sciences de l'URSS, N.S. (1934), pp. 1-9.
2. S. BERNSTEIN, Comptes Rendus de l'Académie des Sciences de l'URSS, N.S. (1934), pp. 361-364.
3. J. L. DOOB, Transactions of the American Mathematical Society 42 (1937), pp. 107-140.
4. J. L. DOOB, Duke Mathematical Journal 6 (1940), pp. 290-306.
5. J. L. DOOB, Transactions of the American Mathematical Society 47 (1940), pp. 455-486.
6. G. L. DE HAAS-LORENTZ, Die Brownsche Bewegung und einige verwandte Erscheinungen, Braunschweig (1913).
7. J. HOLTZMARK, Annalen der Physik 58 (1919), pp. 577-630.
8. M. KAC, American Journal of Mathematics 61 (1939), pp. 726-728.
9. A. KHINTCHINE, Mathematische Annalen 109 (1934), pp. 604-615.
10. A. KHINTCHINE, Ergebnisse der Mathematik 4 No. 3.
11. G. KRUTKOW, Physikalische Zeitschrift der Sowjet-Union 5 (1934), pp. 287-300.
12. P. LÉVY, Pisa Annali Series 2 vol. 3 (1934), pp. 337-366.
13. P. LÉVY, Pisa Annali Series 2 vol. 4 (1935), pp. 217-218.
14. P. LÉVY, Théorie de l'addition des variables aléatoires, Paris 1937.
15. L. S. ORNSTEIN AND G. E. UHLENBECK, Physical Review 36 (1930), pp. 823-841.
16. L. S. ORNSTEIN AND W. R. VAN WIJK, Physica 1 (1934), pp. 235-254, errata p. 966.
17. W. R. VAN WIJK, Physica 3 (1936), pp. 1111-1119.
18. N. WIENER AND R. E. A. C. PALEY, Fourier Transforms in the Complex Domain, American Mathematical Society Colloquium Publications Vol. XIX.







13 For further details, cf. Lévy (14) Chapter VII.













ANNALS OF MATHEMATICS 
Vol. 43, No. 2, April, 1942 


A NEW HOMOLOGY THEORY 


By W. MAYER
(Received January 20, 1942) 


In the classical homology theory one considers a sequence $C^0, C^1, \cdots$ of additive
groups and homomorphisms $C^{n+1} \to C^n$ such that the induced homomorphisms
$C^{n+2} \to C^n$ are trivial, i.e. $C^{n+2}$ is mapped into the zero of $C^n$. In the
present paper we propose to develop a new homology theory which also uses a
sequence $C^0, C^1, \cdots$ of additive groups and homomorphisms $C^{n+1} \to C^n$. In this
new theory, however, a fixed prime number p is chosen, and the induced homomorphisms
$C^{n+p} \to C^n$ are the ones which are trivial. These groups for p = 3
were already considered at length, but by a different method, in the following
papers: Mayer-Campbell, Generalized Homology Groups, Proc. Nat. Acad. Sci.,
U.S.A., 26, 655-656 (1940), and Generalized Homology Groups, to be published
shortly in Revista de Matematicas y Fisica Teórica.


1. Simplicial systems 


We first define the new homology theory for a simplicial system, i.e. a collection
$\Sigma = \{\sigma\}$ of (non-oriented) simplexes such that the faces of any simplex $\sigma \in \Sigma$ also
belong to $\Sigma$. In addition to the simplexes of $\Sigma$, henceforth called simple
simplexes, we shall consider simplexes with repeated vertices, or generalized
simplexes

(1.1)    $(P_1^{a_1} P_2^{a_2} \cdots P_\nu^{a_\nu})$

where $P_i \ne P_j$, and the integers $a_i \ge 1$ indicate the multiplicity of the corresponding
vertex.

The simplex (1.1) shall belong to $\Sigma$ if and only if the simple simplex
$(P_1 P_2 \cdots P_\nu)$ belongs to $\Sigma$. The dimension of the generalized simplex (1.1) is
defined to be $a_1 + a_2 + \cdots + a_\nu - 1$.

Hereafter we use the symbol $\Sigma$ to denote the extended system of all the simple
and generalized simplexes thus obtained.

We now introduce, for a fixed prime number p ($\ne 1$), the group $C^n$ of n-chains,
$K^n$, mod p. These n-chains $K^n$ are made up of simplexes $\rho^n$ of $\Sigma$ of dimension
n which may or may not be simple:

$$K^n = \sum_{j=1}^{N} a_j^n \rho_j^n.$$

A boundary operator F is first defined for simplexes of dimension > 0 by

(1.2)    $F(P_1^{a_1} \cdots P_\nu^{a_\nu}) = \sum_{i=1}^{\nu} a_i\, (P_1^{a_1} \cdots P_{i-1}^{a_{i-1}} P_i^{a_i - 1} P_{i+1}^{a_{i+1}} \cdots P_\nu^{a_\nu}),$



where vertices with exponent zero are crossed out, and the coefficients are reduced
mod p. Just as in the classical theory, the formula

(1.3)    $F\big(\sum a_j \rho_j^n\big) = \sum a_j F(\rho_j^n)$

defines a homomorphism $C^n \to C^{n-1}$ for n > 0.
If the n-dimensional simplex (1.1) is written in the form

(1.4)    $(P_1 P_2 \cdots P_{n+1})$

where the vertices $P_1, P_2, \cdots, P_{n+1}$ need not be distinct, the formula (1.2) may
be written

(1.5)    $F(P_1 \cdots P_{n+1}) = \sum_{i=1}^{n+1} (P_1 \cdots P_{n+1})_i, \qquad n \ge 1,$

where $(P_1 \cdots P_{n+1})_i$ is the face of $(P_1 \cdots P_{n+1})$ opposite the vertex $P_i$. Denote
by $(P_1 \cdots P_{n+1})_{j_1 \cdots j_t}$ the face opposite the face $(P_{j_1} \cdots P_{j_t})$ and let
$F^2(\ ) = F(F(\ )), \cdots, F^t(\ ) = F(F^{t-1}(\ ))$. The formula (1.5) enables us to calculate
rapidly the boundary of a boundary etc. Thus

$$F^2(P_1 \cdots P_{n+1}) = 2 \sum_{(j_1 j_2)} (P_1 \cdots P_{n+1})_{j_1 j_2}, \qquad n \ge 2,$$

and in general,

(1.6)    $F^t(P_1 \cdots P_{n+1}) = t! \sum_{(j_1 \cdots j_t)} (P_1 \cdots P_{n+1})_{j_1 \cdots j_t}, \qquad n \ge t,$

where the summation $(j_1 \cdots j_t)$ runs over all $\binom{n+1}{t}$ combinations of $1, 2, \cdots, n+1$.
If $t \ge p$ then

(1.7)    $F^t(P_1 \cdots P_{n+1}) = 0^{n-t},$

where $0^{n-t}$ denotes the zero of $C^{n-t}$. If $t < p$ then in general $F^t(P_1 \cdots P_{n+1}) \ne 0^{n-t}$.
Thus, for $n \ge t$ the homomorphism $F^t$ maps $C^n$ into the zero of $C^{n-t}$ if $t \ge p$.
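
The formulas (1.2), (1.3) and the vanishing statement (1.7) can be checked mechanically on small examples. The following Python sketch is an illustration only and not part of the original paper. It assumes that a generalized simplex is stored as a sorted tuple of (vertex, multiplicity) pairs and a chain as a dict of coefficients mod p; the names `simplex`, `boundary` and `iterate` are hypothetical. Deleting the only vertex of a 0-simplex gives the empty tuple, which here stands for the unit of dimension −1 used later in Appendix I.

```python
from collections import Counter


def simplex(**mult):
    """Build a generalized simplex from multiplicities, e.g. simplex(P=2, Q=1)."""
    return tuple(sorted((v, a) for v, a in mult.items() if a > 0))


def boundary(chain, p):
    """Apply F of (1.2)-(1.3): lower one multiplicity at a time, coefficients mod p."""
    out = Counter()
    for s, coeff in chain.items():
        for i, (v, a) in enumerate(s):
            # A vertex whose exponent drops to zero is crossed out.
            lowered = ((v, a - 1),) if a > 1 else ()
            face = s[:i] + lowered + s[i + 1:]
            out[face] = (out[face] + coeff * a) % p
    return {s: c for s, c in out.items() if c}


def iterate(chain, t, p):
    """Return F^t(chain)."""
    for _ in range(t):
        chain = boundary(chain, p)
    return chain


if __name__ == "__main__":
    p = 3
    K = {simplex(P1=1, P2=1, P3=1): 1}
    print(iterate(K, 2, p))  # F^2 of a triangle, not zero when p = 3
    print(iterate(K, p, p))  # F^p, the empty (zero) chain
```

For p = 2 the same function reduces to the classical mod 2 boundary, where already $F^2$ vanishes.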
2. q-cycles of dimension n


If $1 \le q < \min\{p, n + 1\}$, an n-chain $K^n$ will be called an n-dimensional
q-cycle (briefly a (q, n)-cycle) whenever

(2.1)    $F^q(K^n) = 0^{n-q}.$

By (1.3) the (q, n)-cycles form a subgroup $Z_q^n$ of $C^n$.
For any (n + p − q)-chain $K^{n+p-q}$

(2.2)    $F^q[F^{p-q}(K^{n+p-q})] = F^p(K^{n+p-q}) = 0^{n-q}.$

Hence the (p − q)th boundary of an (n + p − q)-chain is always a (q, n)-cycle.
These (p − q)th boundaries form a subgroup $B_q^n$ of $Z_q^n$. The difference group

$$H_q^n = Z_q^n - B_q^n$$







will be called the q-th n-dimensional homology group of $\Sigma$ (briefly the (q, n)-homology
group). It is defined for $q \le n$ since (2.1) has meaning only in this
case. If q > n, so that (2.1) is not applicable, we define

(2.3)    $Z_q^n = C^n, \qquad q > n.$

Thus every n-dimensional chain is a (q, n)-cycle when q > n. The group $B_q^n$
of (p − q)th boundaries is defined for every n and every q < p and is a subgroup
of $Z_q^n$; hence the group $H_q^n$ is defined by (2.3) for every $n = 0, 1, 2, \cdots$ and
every q such that $1 \le q < p$.


3. Regular and degenerate chains 


We shall call a simplex $\sigma$ degenerate if one or more of its vertices have a multiplicity
greater than p − 1; otherwise it is said to be regular. A chain will be
called regular if its simplexes are all regular and degenerate if its simplexes are all
degenerate. Thus every chain $K^n$ can be represented uniquely as the sum of a
regular chain $K^n_{(r)}$ and a degenerate chain $K^n_{(d)}$

(3.1)    $K^n = K^n_{(r)} + K^n_{(d)}.$

Thereby we consider the zero chain as the only chain both regular and degenerate.
From (1.2) and (1.3) it follows that the boundary of a regular chain
is regular and the boundary of a degenerate chain is degenerate. Hence (for
n > 0) (3.1) implies that

(3.2)    $F(K^n) = F(K^n_{(r)}) + F(K^n_{(d)}),$

where

(3.3)    $[F(K^n)]_{(r)} = F(K^n_{(r)}), \qquad [F(K^n)]_{(d)} = F(K^n_{(d)}).$

Thus the simplicial system $\Sigma$ is the "direct sum" of its two sub-systems $\Sigma_{(r)}$
(of regular chains) and $\Sigma_{(d)}$ (of degenerate chains):

$$\Sigma = \Sigma_{(r)} + \Sigma_{(d)}.$$
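
The compatibility (3.3) of the boundary operator with the splitting (3.1) can be observed on a small example. The Python sketch below is only an illustration under the same hypothetical conventions as before: a simplex is a sorted tuple of (vertex, multiplicity) pairs, a chain is a dict of coefficients mod p, and the helper names `boundary` and `split` are not from the paper.

```python
from collections import Counter


def boundary(chain, p):
    """Boundary operator F of (1.2)-(1.3) with coefficients mod p."""
    out = Counter()
    for s, coeff in chain.items():
        for i, (v, a) in enumerate(s):
            lowered = ((v, a - 1),) if a > 1 else ()
            face = s[:i] + lowered + s[i + 1:]
            out[face] = (out[face] + coeff * a) % p
    return {s: c for s, c in out.items() if c}


def split(chain, p):
    """Split a chain into its regular and degenerate parts as in (3.1)."""
    regular, degenerate = {}, {}
    for s, c in chain.items():
        # Degenerate: some vertex has multiplicity greater than p - 1.
        target = degenerate if any(a > p - 1 for _, a in s) else regular
        target[s] = c
    return regular, degenerate


if __name__ == "__main__":
    p = 3
    K = {
        (("P", 3), ("Q", 1)): 1,  # degenerate: P has multiplicity 3 > p - 1
        (("P", 2), ("Q", 2)): 2,  # regular
        (("Q", 1), ("R", 4)): 1,  # degenerate
    }
    K_r, K_d = split(K, p)
    FK_r, FK_d = split(boundary(K, p), p)
    print(FK_r == boundary(K_r, p), FK_d == boundary(K_d, p))
```

The reason the degenerate part stays degenerate is visible in the code: lowering a multiplicity from p to p − 1 carries the coefficient p, which vanishes mod p.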


(3.4) THEOREM. The (q, n)-homology groups of $\Sigma$ are isomorphic with the corresponding
groups of $\Sigma_{(r)}$.

Let $K^n_{(r)}$ be a (q, n)-cycle of $\Sigma_{(r)}$; hence also a (q, n)-cycle of $\Sigma$. Let $\{K^n_{(r)}\}_{(r)}$
and $\{K^n_{(r)}\}$ denote its respective homology classes in $\Sigma_{(r)}$ and $\Sigma$. The mapping $\tau$:

(3.5)    $\{K^n_{(r)}\}_{(r)} \to \{K^n_{(r)}\}$

defines a homomorphism $\tau$ of the (q, n)-homology group of $\Sigma_{(r)}$ into the (q, n)-homology
group of $\Sigma$. The nucleus of $\tau$ is the zero class of the (q, n)-homology
group of $\Sigma_{(r)}$. In fact if $\{K^n_{(r)}\}_{(r)}$ belongs to the nucleus, then there is a chain
$K^{n+p-q}$ of $\Sigma$ such that

(3.6)    $F^{p-q}(K^{n+p-q}) = K^n_{(r)}.$







If $K^{n+p-q}_{(r)}$ denotes the regular part of $K^{n+p-q}$ it follows then from (3.1), (3.2)
and (3.3) that

(3.7)    $F^{p-q}(K^{n+p-q}_{(r)}) = K^n_{(r)}.$

Thus $\tau$ is univalent (= an isomorphism with a subgroup). To prove (3.4) we
merely need to show that $\tau$ is a mapping "onto," i.e. that every class of the
(q, n)-homology group of $\Sigma$ contains a regular cycle. Let $\{K^n\}$ be such a class.

We may suppose that $K^n$ is not regular, so that some vertex P appears in $K^n$
with a multiplicity > p − 1. The chain $K^n$ may then be written

(3.8)    $K^n = K_1^n + K_2^n,$

where

(3.9)    $K_1^n = R^n + P R^{n-1} + \cdots + P^{p-1} R^{n-p+1}, \qquad K_2^n = P^p S^{n-p},$

and the chains $R^i$ do not have the vertex P. Since no (n − q)-simplex can
belong to the q-boundaries of both $K_1^n$ and $K_2^n$ it follows from $F^q(K^n) = 0^{n-q}$
that

(3.10)    $F^q(K_1^n) = F^q(K_2^n) = 0^{n-q}.$

The (q, n)-cycle $K_2^n$ whose dimension n is greater than p − 1 lies in the closure
of the star St P of P. Hence (Appendix I) $K_2^n$ belongs to the zero class of the
(q, n)-homology group of $\Sigma$ and hence, by (3.8), $K_1^n$ belongs to the homology
class of $K^n$. Thus the removal of those simplexes of a cycle $K^n$ which contain
a vertex P with a multiplicity greater than p − 1 does not alter the homology
class of $K^n$. Obviously $K^n_{(r)}$ is obtained from $K^n$ by repetition of this process.
Hence the regular (q, n)-cycle $K^n_{(r)}$ belongs to the homology class of $K^n$. This
completes the proof of (3.4).


4. Invariance under Subdivision 


We shall consider here the effect of a certain elementary subdivision of the
simplicial system $\Sigma$ with respect to the new homology groups. Let (ab) be a
1-simplex of the not-extended system $\Sigma$ and $\gamma$ the "midpoint" of (ab). We
form a new simplicial system, $\Sigma(ab)$, which has all the simple simplexes of $\Sigma$
except those which contain both of the vertices a and b. A simplex A(ab), where
A is a simplex free from a and b and may be vacuous, is replaced by the simplexes

(4.1)    $A(a\gamma), \quad A(b\gamma), \quad A\gamma,$

where the first two have the same dimension as A(ab) and the last has a dimension
lower by one. The simple simplexes of $\Sigma$ which do not contain the face
(ab), the simplexes (4.1) and their generalized simplexes, constitute the simplicial
system $\Sigma(ab)$.

In the construction of $\Sigma(ab)$ from $\Sigma$ we limited ourselves to simple simplexes
because every simplicial system is determined by its simple simplexes.










(4.2) THEOREM: The simplicial systems $\Sigma$ and $\Sigma(ab)$ have isomorphic (q, n)-homology
groups.

First we construct a subsystem $\Sigma^*$ of $\Sigma(ab)$ which is isomorphic to $\Sigma$. The
n-chains ${}^*C^n$ of $\Sigma^*$ are generated by n-chains of $\Sigma(ab)$ of the form

(4.3)    $A(a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu),$

where $\nu, \mu = 0, 1, 2, \cdots$ are non-negative integers and A are simplexes of $\Sigma(ab)$
free from the vertices a, b and $\gamma$ and may be vacuous. It is easy to verify that
$F({}^*C^{n+1}) \subset {}^*C^n$ so that $\Sigma^*$ is indeed a subsystem (not with a simplicial basis)
of $\Sigma(ab)$. Furthermore, by the 1:1 correspondence

(4.4)    $A(a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu) \leftrightarrow A(a^\nu b^\mu)$

between the bases of $\Sigma^*$ and $\Sigma$ we establish isomorphisms between the groups
of the n-chains ${}^*C^n$ and $C^n(\Sigma)$ for $n = 0, 1, \cdots$, which for $n = 1, 2, \cdots$ commute
with the boundary operator F, thus showing the isomorphism of the
systems $\Sigma$ and $\Sigma^*$. We prove this last statement for the basic chains (4.4).
It is trivial when $\nu = 0$ or $\mu = 0$. We therefore assume that both $\nu$ and $\mu$
are $\ge 1$. Then (from the rule of Appendix I for taking boundaries of product
chains)

(4.5)    $F[A(a^\nu b^\mu)] = (a^\nu b^\mu)F(A) + A[\nu a^{\nu-1} b^\mu + \mu a^\nu b^{\mu-1}]$
         $\leftrightarrow (a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu)F(A) + A[\nu(a^{\nu-1}\gamma^\mu - \gamma^{\nu+\mu-1} + \gamma^{\nu-1} b^\mu)$
         $\qquad + \mu(a^\nu\gamma^{\mu-1} - \gamma^{\nu+\mu-1} + \gamma^\nu b^{\mu-1})] = F[A(a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu)].$

Hence $\Sigma$ and $\Sigma^*$ are isomorphic.
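Now we define a homomorphism of $C^n(\Sigma(ab))$ into ${}^*C^n$ by the mapping

(4.6)    $A a^\nu\gamma^\mu \to A a^{\nu+\mu},$
         $A \gamma^\nu b^\mu \to A(a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu),$

of the basic chains (simplexes) of $C^n(\Sigma(ab))$ into basic chains of ${}^*C^n$. As before,
A denotes simplexes free from the vertices a, b and $\gamma$, and $\nu$ and $\mu$ are $\ge$ zero.
This homomorphism also commutes with the boundary operator F. In fact

(4.7)    $F[A a^\nu\gamma^\mu] = a^\nu\gamma^\mu F(A) + A[\nu a^{\nu-1}\gamma^\mu + \mu a^\nu\gamma^{\mu-1}]$
         $\to a^{\nu+\mu}F(A) + A[(\nu + \mu)a^{\nu+\mu-1}] = F(A a^{\nu+\mu}),$

and

(4.8)    $F[A \gamma^\nu b^\mu] = \gamma^\nu b^\mu F(A) + A[\nu\gamma^{\nu-1} b^\mu + \mu\gamma^\nu b^{\mu-1}]$
         $\to (a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu)F(A) + A[\nu(a^{\nu-1}\gamma^\mu - \gamma^{\nu+\mu-1} + \gamma^{\nu-1} b^\mu)$
         $\qquad + \mu(a^\nu\gamma^{\mu-1} - \gamma^{\nu+\mu-1} + \gamma^\nu b^{\mu-1})] = F[A(a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu)].$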

The homomorphism (4.6) therefore maps (q, n)-cycles into (q, n)-cycles and
(p − q)th boundaries into (p − q)th boundaries. In particular it determines a
homomorphism of the (q, n)-homology groups which we now show to be an isomorphism.
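
That the mapping (4.6) commutes with F, as computed in (4.7) and (4.8), can be tested on individual simplexes. The Python sketch below is an illustration only. It assumes the hypothetical encoding used for chains as dicts of sorted (vertex, multiplicity) tuples with coefficients mod p, writes the midpoint $\gamma$ as the vertex name `"g"`, and relies on the fact that no simplex of $\Sigma(ab)$ contains both a and b.

```python
from collections import Counter


def boundary(chain, p):
    """Boundary operator F of (1.2)-(1.3) with coefficients mod p."""
    out = Counter()
    for s, coeff in chain.items():
        for i, (v, a) in enumerate(s):
            lowered = ((v, a - 1),) if a > 1 else ()
            face = s[:i] + lowered + s[i + 1:]
            out[face] = (out[face] + coeff * a) % p
    return {s: c for s, c in out.items() if c}


def make(rest, **mult):
    """Simplex A * a^.. * g^.. * b^..; `rest` holds the (vertex, multiplicity) pairs of A."""
    pairs = list(rest) + [(v, k) for v, k in mult.items() if k > 0]
    return tuple(sorted(pairs))


def phi(chain, p):
    """Chain map (4.6), extended linearly; 'g' stands for the midpoint gamma."""
    out = Counter()
    for s, c in chain.items():
        m = dict(s)
        nu_a, mu_b, k = m.pop("a", 0), m.pop("b", 0), m.pop("g", 0)
        if nu_a and mu_b:
            raise ValueError("a simplex of Sigma(ab) cannot contain both a and b")
        rest = tuple(m.items())
        if mu_b == 0:
            # A a^nu g^mu -> A a^(nu + mu)
            terms = [(make(rest, a=nu_a + k), 1)]
        else:
            # A g^nu b^mu -> A (a^nu g^mu - g^(nu + mu) + g^nu b^mu)
            terms = [
                (make(rest, a=k, g=mu_b), 1),
                (make(rest, g=k + mu_b), -1),
                (make(rest, g=k, b=mu_b), 1),
            ]
        for t, sign in terms:
            out[t] = (out[t] + sign * c) % p
    return {s: c for s, c in out.items() if c}


if __name__ == "__main__":
    p = 3
    s = make((("X", 1),), g=2, b=1)  # the simplex X g^2 b
    print(phi(boundary({s: 1}, p), p) == boundary(phi({s: 1}, p), p))
```

The two rules of (4.6) overlap on simplexes $A\gamma^\nu$ that contain neither a nor b; both rules then give $A a^\nu$, so the sketch may use either branch.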

The homomorphism (4.6) leaves the chains of ${}^*C^n$ unaltered. We show this
also for the basic chains (4.3) of ${}^*C^n$. In fact we have here

$$A(a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu) \to A[a^{\nu+\mu} - a^{\nu+\mu} + a^\nu\gamma^\mu - \gamma^{\nu+\mu} + \gamma^\nu b^\mu].$$

Now let $K^n$ be a (q, n)-cycle of ${}^*C^n$ and let $\{K^n\}^*$ and $\{K^n\}$ denote its homology
classes in ${}^*H_q^n$ and $H_q^n(\Sigma(ab))$ respectively. Under the homomorphism of the
homology groups defined by (4.6) the image of $\{K^n\}$ is $\{K^n\}^*$. Thus the image
of this homomorphism is the (q, n)-homology group ${}^*H_q^n$ of $\Sigma^*$ itself.

Suppose now that $K^n$ is a (q, n)-cycle of $C^n(\Sigma(ab))$ which is mapped by the
homomorphism (4.6) into the zero class of ${}^*H_q^n$, i.e. the image cycle ${}^*K^n$ of $K^n$
is a (p − q)th boundary of a chain of ${}^*C^{n+p-q}$. By (4.6) the (q, n)-cycle
$K^n - {}^*K^n$ is a linear combination of basis chains of the form

(4.9)    $A(a^\nu\gamma^\mu - a^{\nu+\mu}), \qquad A(a^\nu\gamma^\mu - \gamma^{\nu+\mu}).$

The sum of the coefficients of $K^n - {}^*K^n$ is therefore zero.

Furthermore, since the chains (4.9) lie in the closure of St a of $\Sigma(ab)$, it follows
[Appendix I] that the cycle $K^n - {}^*K^n$ lies in the zero class of its homology
group, i.e. is a (p − q)th boundary. Since ${}^*K^n$ is a (p − q)th boundary, it follows
that $K^n$ itself is a (p − q)th boundary of a chain of $C^{n+p-q}(\Sigma(ab))$.

Thus the homomorphism of $H_q^n(\Sigma(ab))$ onto ${}^*H_q^n$ is univalent and therefore
an isomorphism. Hence, referring to the isomorphism (4.4), the simplicial
systems $\Sigma$ and $\Sigma(ab)$ have isomorphic (q, n)-homology groups. This isomorphism,
which is a combination of the isomorphisms (4.6) and (4.4), is obviously
generated by the simplicial mapping of $\Sigma(ab)$ into $\Sigma$ in which $\gamma$ is mapped into a
and the other vertices remain unaltered. Collecting the results, we have:

The simplicial mapping of $\Sigma(ab)$ into $\Sigma$, in which $\gamma$ is mapped into a (or b)
and the other vertices remain unaltered, generates an isomorphism of the (q, n)-homology
groups.


5. The new homology groups for topological spaces 


The passage from finite simplicial systems to arbitrary topological spaces is
similar to the procedure utilized by Čech for ordinary homology groups. We
shall suppose the reader familiar with that method, and for details refer him
to Čech's initial paper: Théorie générale de l'homologie dans un espace quelconque,
Fundam. Math. 19 (1932), 149-183, or to the full exposition of the theory in the
forthcoming book by Lefschetz, Algebraic Topology, in the Colloquium Series,
Chap. VII, §1. This book will be referred to in the sequel as "L," and its
general terminology will be utilized in the present section. In particular, here
also a topological space designates a space which satisfies all but the separation
axioms for Hausdorff spaces (L, Ch. I, No. 6).

The general argument in the Čech theory runs as follows: Let $\{\mathfrak{U}_\lambda\}$ be the
finite open coverings of a topological space $\mathfrak{R}$, $\Phi_\lambda$ the nerve of $\mathfrak{U}_\lambda$, and if $\mathfrak{U}_\mu$
refines $\mathfrak{U}_\lambda$ let $\pi_\lambda^\mu$ be a projection by inclusion $\Phi_\mu \to \Phi_\lambda$ (see L, Ch. VII, 1.3).
The projections $\pi_\lambda^\mu$ for given $\lambda$, $\mu$ induce a unique simultaneous homomorphism
$\boldsymbol{\pi}_\lambda^\mu$ of the homology groups $H^n$ of $\Phi_\mu$ into the corresponding groups of $\Phi_\lambda$, thus
giving rise to inverse systems $S^n = \{H^n(\Phi_\lambda); \boldsymbol{\pi}_\lambda^\mu\}$, and the corresponding groups
of $\mathfrak{R}$ are defined as $H^n = \lim S^n$.

The same argument may be repeated for the new homology groups. The
only step which is new is the explicit proof that $\boldsymbol{\pi}_\lambda^\mu$ is unique. As in (L, Ch.
VII, 1.4) any two projections $\pi_\lambda^\mu$, $\pi_\lambda'^{\mu}$ are shown to be "prismatically related"
(in the sense of L, Ch. IV, 16.2). The explicit deduction of the uniqueness of
$\boldsymbol{\pi}_\lambda^\mu$ from this property is given in Appendix II. As a consequence we shall have
here also the inverse system $S_q^n = \{H_q^n(\Phi_\lambda); \boldsymbol{\pi}_\lambda^\mu\}$ and define for $\mathfrak{R}$: $H_q^n = \lim S_q^n$.


6. Application to finite polyhedra 


Let $|\Sigma|$ be a finite simplicial polyhedron, where $\Sigma$ is a finite Euclidean complex
in the sense of (L, Ch. III, 6.9). We have thus the topological groups $H_q^n(|\Sigma|)$
of the space $|\Sigma|$ in the sense just defined, and also the combinatorial groups
$H_q^n(\Sigma)$ of the simplicial system $\Sigma$. The proof of the isomorphism of the two is
carried out as in (L, Ch. VIII, 10). The only modification made is the following:
Instead of taking the successive barycentric subdivisions we choose successive
subdivisions $\Sigma, \Sigma', \Sigma'', \cdots, \Sigma^n, \cdots$ where $\Sigma^{n+1}$ is deduced from $\Sigma^n$ by introducing
the "midpoint" $\gamma$ of one of the largest of its one-simplexes (ab).

In the notation of §4 we have $\Sigma^{n+1} = \Sigma^n(ab)$. We have seen in §4 that for
the simplicial projection

$$\pi_n^{n+1}: \Sigma^{n+1} \to \Sigma^n,$$

which leaves unaltered all vertices of $\Sigma^{n+1}$ but $\gamma$, which vertex is mapped into one
of the vertices a or b, isomorphisms between the new homology groups result.
To complete the parallel with the treatment loc. cit. it suffices to observe that
the maximal diameter of $\Sigma^n$ has a length converging to zero for $n \to \infty$.
This follows from the fact that in constructing $\Sigma^{n+1}$ from $\Sigma^n$ we choose the
midpoint of one of the largest of the one-simplexes of $\Sigma^n$.
We have then the analogue of (L, Ch. VIII, 10.1):
(6.1) THEOREM. The groups $H_q^n(\Sigma)$ of the simplicial system $\Sigma$ are isomorphic
with the corresponding groups of the polyhedron $|\Sigma|$. Therefore the $H_q^n(\Sigma)$ are
topological invariants of $|\Sigma|$. That is to say, if two polyhedra $|\Sigma|$, $|\Sigma_1|$ are
homeomorphic, the corresponding (combinatorial) groups $H_q^n$ are the same.


7. Appendix I 


Let $K^n$ be an n-dimensional chain of the simplicial system $\Sigma$, P a vertex of $\Sigma$
and St P the star of P in $\Sigma$.
(7.1) THEOREM: If $n \ne q - 1$ then every (q, n)-cycle $K^n$ which lies in the closure
of St P is a (p − q)th boundary; if $n = q - 1$ then a (q, n)-cycle $K^n$ which lies
in the closure of St P is a (p − q)th boundary if and only if the sum of its coefficients
is zero (mod p).




















Before proving this theorem we define the product-chain of the chains

$$K_1^\nu = \sum_\sigma a_{1\sigma}\, \rho_{1\sigma}^\nu, \qquad K_2^\mu = \sum_\tau a_{2\tau}\, \rho_{2\tau}^\mu$$

by

(7.2)    $K_1^\nu K_2^\mu = \sum_{\sigma, \tau} a_{1\sigma} a_{2\tau}\, \rho_{1\sigma}^\nu \rho_{2\tau}^\mu,$

where $\rho_{1\sigma}^\nu \rho_{2\tau}^\mu$ is the $(\nu + \mu + 1)$-dimensional simplex containing the vertices of
both simplexes $\rho_{1\sigma}^\nu$ and $\rho_{2\tau}^\mu$, if and only if the product chain (7.2) belongs to $\Sigma$.
If $\mu$ and $\nu$ are positive integers, then $F(K_1^\nu)$ and $F(K_2^\mu)$ are defined and

(7.3)    $F(K_1^\nu K_2^\mu) = K_2^\mu F(K_1^\nu) + K_1^\nu F(K_2^\mu).$

We can extend this rule to cover the cases of $\nu \le 0$ or $\mu \le 0$ (i.e. chains of
zero or "negative dimensions")¹ by defining

(7.4)    $F(\rho^0) = 1$, $\rho^0$ a zero-dimensional simplex,

and

(7.5)    $F(1) = 0, \qquad F(0) = 0.$

Remark: If q > n then $F^q(K^n) = 0$ no longer characterizes the (q, n)-cycles
(cf. §2). Because then every n-dimensional chain is a q-cycle whereas
$F^q(K^n) = 0$ is not always true if q = n + 1.

We now return to the proof of (7.1). If $K^n$ is an n-dimensional chain which
lies in the closure of St P, we define for it the (linear) operators $D_\nu$, $\nu = 1, 2, \cdots, p$,
by the formula

(7.6)    $D_\nu(K^n) = (-1)^{\nu-1}(\nu - 1)!\, P^{p-\nu} K^n, \qquad \nu = 1, \cdots, p.$

In this formula 0! and $P^0$ are to be replaced, as usual, by 1. By (7.3) and (7.6)
we have, for $\nu = 1, 2, \cdots, p - 1$

(7.7)    $FD_\nu(K^n) = (-1)^\nu \nu!\, P^{p-\nu-1} K^n + (-1)^{\nu-1}(\nu - 1)!\, P^{p-\nu} F(K^n).$

(7.8)    $(FD_\nu - D_\nu F)K^n = D_{\nu+1}K^n, \qquad \nu = 1, \cdots, p - 1.$

Thus, in terms of operators

(7.9)    $FD_\nu - D_\nu F = D_{\nu+1}, \qquad \nu = 1, \cdots, p - 1.$


1 In this section and the next we shall make use for formal reasons of chains of negative
dimensions.

To the chain-groups $C^n$, $n = 0, 1, 2, \cdots$ of §2 we add the chain-groups $C^{-n}$, $n = 1, 2, \cdots$ of
chains of negative dimensions. $C^{-1}$ (by definition) is the group of the rest-classes modulo p
and the groups $C^{-n}$, $n = 2, 3, \cdots$ consist of the zero-element only.

(7.4) and (7.5) define the homomorphisms $C^0 \to C^{-1}$, $C^{-1} \to C^{-2}, \cdots$. The relation (7.3)
then is valid for any $\nu$ and $\mu = 0, \pm 1, \pm 2, \cdots$.










By eliminating $D_2, D_3, \cdots, D_\nu$ from the first $\nu$ of the formulae (7.9), we find

(7.10)    $D_{\nu+1} = \sum_{j=0}^{\nu} (-1)^j \binom{\nu}{j} F^{\nu-j} D_1 F^j.$

Thus, for $\nu = 1, 2, \cdots, p - 1$ we have

(7.11)    $\sum_{j=0}^{\nu} (-1)^j \binom{\nu}{j} F^{\nu-j} D_1 F^j K^n = (-1)^\nu \nu!\, P^{p-\nu-1} K^n.$

In particular, when $\nu = p - 1$,

(7.12)    $\sum_{j=0}^{p-1} (-1)^j \binom{p-1}{j} F^{p-j-1} D_1 F^j K^n = (p - 1)!\, K^n.$

But p is a prime number so that $(p - 1)! \equiv -1$ and $\binom{p-1}{j} \equiv (-1)^j$ (mod p),
$j = 0, 1, \cdots, p - 1$. Hence, for every chain $K^n$ in the closure of St P,

(7.13)    $F^{p-1} D_1 K^n + F^{p-2} D_1 F K^n + \cdots + F D_1 F^{p-2} K^n + D_1 F^{p-1} K^n = -K^n.$

Now let $K^n$ be a (q, n)-cycle in the closure of St P. If $q \le n$ or $q > n + 1$
then $F^q(K^n) = 0$ by (2.1), (7.4) and (7.5). Hence, by (7.13), if $q \ne n + 1$

(7.14)    $-K^n = F^{p-1} D_1 K^n + \cdots + F^{p-q} D_1 F^{q-1} K^n$
          $\quad\ \ = F^{p-q}(F^{q-1} D_1 K^n + \cdots + D_1 F^{q-1} K^n).$

This proves the first assertion of (7.1). 
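
The identity (7.13) and the two congruences behind it can be confirmed on examples. The Python sketch below is an illustration only: it assumes the hypothetical encoding of chains as dicts of sorted (vertex, multiplicity) tuples with coefficients mod p, represents the unit of dimension −1 of (7.4) by the empty tuple, and takes $D_1 K = P^{p-1}K$ from (7.6). It presupposes that all products with P belong to the system, as is the case for chains in the closure of St P.

```python
from collections import Counter
from math import comb, factorial


def boundary(chain, p):
    """Boundary operator F; the empty tuple plays the role of 1 in dimension -1."""
    out = Counter()
    for s, coeff in chain.items():
        for i, (v, a) in enumerate(s):
            lowered = ((v, a - 1),) if a > 1 else ()
            face = s[:i] + lowered + s[i + 1:]
            out[face] = (out[face] + coeff * a) % p
    return {s: c for s, c in out.items() if c}


def times_power(chain, vertex, k):
    """Product chain vertex^k * chain in the sense of (7.2)."""
    out = {}
    for s, c in chain.items():
        m = dict(s)
        m[vertex] = m.get(vertex, 0) + k
        out[tuple(sorted(m.items()))] = c
    return out


def add(x, y, p):
    """Sum of two chains mod p."""
    out = Counter(x)
    for s, c in y.items():
        out[s] = (out[s] + c) % p
    return {s: c for s, c in out.items() if c}


def left_side_7_13(K, vertex, p):
    """Sum over j = 0..p-1 of F^(p-1-j) D_1 F^j K, with D_1 K = vertex^(p-1) K."""
    total, FjK = {}, K
    for j in range(p):
        term = times_power(FjK, vertex, p - 1)
        for _ in range(p - 1 - j):
            term = boundary(term, p)
        total = add(total, term, p)
        FjK = boundary(FjK, p)
    return total


if __name__ == "__main__":
    p = 3
    K = {(("Q", 1), ("R", 1)): 1, (("P", 1), ("Q", 1)): 2}
    minus_K = {s: (-c) % p for s, c in K.items()}
    print(left_side_7_13(K, "P", p) == minus_K)
    print(factorial(p - 1) % p == p - 1)
    print(all(comb(p - 1, j) % p == (-1) ** j % p for j in range(p)))
```

The check does not require K to be a cycle, in agreement with the statement that (7.13) holds for every chain in the closure of St P.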
If $q = n + 1$ then by (7.4), $F^q(K^n) = F^{n+1}(K^n)$ is the sum of the coefficients
of $F^n(K^n)$. Suppose that

(7.15)    $K^n = \sum_\alpha a_\alpha (P_{\alpha_1} \cdots P_{\alpha_{n+1}}).$

Then, by (1.6),

(7.16)    $F^n(K^n) = n! \sum_\alpha a_\alpha \sum_{(i_1 \cdots i_n)} (P_{\alpha_1} \cdots P_{\alpha_{n+1}})_{i_1 \cdots i_n},$

so that the sum of the coefficients of $F^n(K^n)$ is $(n + 1)! \sum a_\alpha$. But $n + 1 = q < p$,
hence $F^{n+1}(K^n)$ is zero if and only if $\sum a_\alpha \equiv 0$ (mod p). If this is the
case, then (7.14) holds and $K^n$ is a (p − q)th boundary. On the other hand if
$K^n$ is a [p − (n + 1)]th boundary

(7.17)    $K^n = F^{p-n-1}(K^{p-1}),$

and if

(7.18)    $K^{p-1} = \sum_\alpha a_\alpha (P_{\alpha_1} \cdots P_{\alpha_p})$

then, by (1.6),

(7.19)    $K^n = (p - n - 1)! \sum_\alpha a_\alpha \sum_{(i_1 \cdots i_{p-n-1})} (P_{\alpha_1} \cdots P_{\alpha_p})_{i_1 \cdots i_{p-n-1}}$

has as the sum of coefficients

(7.20)    $(p - n - 1)! \sum_\alpha a_\alpha \binom{p}{p-n-1}$

and this sum is zero modulo p. This completes the proof of (7.1).


8. Appendix II 


Let $\bar\Sigma$ and $\Sigma$ be simplicial systems and let f and g be simplicial mappings of $\bar\Sigma$
into $\Sigma$. Let $\bar A_\nu$, $\nu = 1, 2, \cdots, N$, denote the vertices of $\bar\Sigma$ and let $A_\nu = f(\bar A_\nu)$
and $A'_\nu = g(\bar A_\nu)$.

(8.1) THEOREM: If, for every simplex $(\bar A_1 \cdots \bar A_n)$ of $\bar\Sigma$ the simplex
$(A_1 \cdots A_n A'_1 \cdots A'_n)$ belongs to $\Sigma$, then f and g determine the same homomorphism of
$H_q^n(\bar\Sigma)$ into $H_q^n(\Sigma)$.

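In proving this theorem we first order the vertices of $\bar\Sigma$ so that the vertices of
every simple simplex receive a certain definite ordering. If the simplex
$(\bar A_1 \cdots \bar A_n)$ is not simple, then $i < k$ implies either that $\bar A_i = \bar A_k$ or that $\bar A_i$
precedes $\bar A_k$ in the given ordering of the vertices of $\bar\Sigma$.

We now define p linear operators $D_1, D_2, \cdots, D_p$, each one of which maps
chains of $\bar\Sigma$ into chains of $\Sigma$. It is sufficient to define these operators for the
simplexes of $\bar\Sigma$ and this is done by the formulae

(8.2)    $D_r(\bar A_1 \cdots \bar A_n) = (-1)^{r-1}(r - 1)! \Big[ \sum_{i=1}^{n} (A_1 \cdots A_{i-1} A_i A_i'^{\,p-r} A'_{i+1} \cdots A'_n)$
         $\qquad - \sum_{i=1}^{n} (A_1 \cdots A_{i-1} A_i'^{\,p-r+1} A'_{i+1} \cdots A'_n) \Big].$

Since we have to apply these operators also to boundaries of chains, we have
to extend the above definition to chains K of "negative dimensions," in which
case $D_r(K)$ shall be zero.

If $r \le p - 1$ we have

(8.3)    $FD_r(\bar A_1 \cdots \bar A_n) = (-1)^r r! \Big[ \sum_{i=1}^{n} (A_1 \cdots A_i A_i'^{\,p-r-1} A'_{i+1} \cdots A'_n) - \sum_{i=1}^{n} (A_1 \cdots A_{i-1} A_i'^{\,p-r} A'_{i+1} \cdots A'_n) \Big]$
         $\qquad + (-1)^{r-1}(r - 1)! \sum_{j=1}^{n} \Big[ \sum_{i \ne j} (A_1 \cdots A_i A_i'^{\,p-r} \cdots A'_n)_j - \sum_{i \ne j} (A_1 \cdots A_{i-1} A_i'^{\,p-r+1} \cdots A'_n)_j \Big],$

where the subscript j indicates that the vertex in the j-th place ($A_j$ if j < i, $A'_j$ if j > i)
is omitted. Furthermore

(8.4)    $\sum_j D_r(\bar A_1 \cdots \bar A_n)_j = D_r \sum_j (\bar A_1 \cdots \bar A_n)_j = D_r F(\bar A_1 \cdots \bar A_n).$

Hence from (8.2), (8.4)

(8.5)    $D_r F(\bar A_1 \cdots \bar A_n) = (-1)^{r-1}(r - 1)! \sum_{j=1}^{n} \Big[ \sum_{i \ne j} (A_1 \cdots A_i A_i'^{\,p-r} \cdots A'_n)_j - \sum_{i \ne j} (A_1 \cdots A_{i-1} A_i'^{\,p-r+1} \cdots A'_n)_j \Big].$

This combined with (8.3) gives

(8.6)    $(FD_r - D_r F)(\bar A_1 \cdots \bar A_n) = D_{r+1}(\bar A_1 \cdots \bar A_n), \qquad r \le p - 1,$

which formula also holds for chains of dimension < 0. Hence, just as in
Appendix I,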


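(8.7)    $D_{r+1} = \sum_{j=0}^{r} (-1)^j \binom{r}{j} F^{r-j} D_1 F^j, \qquad r < p.$

In particular, when $r = p - 1$ (since $(p - 1)! \equiv -1$, $\binom{p-1}{j} \equiv (-1)^j$ (mod p))
we arrive at

(8.8)    $\Big( \sum_{j=0}^{p-1} F^{p-j-1} D_1 F^j \Big)(\bar A_1 \cdots \bar A_n) = -\sum_{i=1}^{n} (A_1 \cdots A_i A'_{i+1} \cdots A'_n)$
         $\qquad + \sum_{i=1}^{n} (A_1 \cdots A_{i-1} A'_i \cdots A'_n) = (A'_1 \cdots A'_n) - (A_1 \cdots A_n).$

Hence for any (n − 1)-dimensional chain $\bar K^{n-1}$ of $\bar\Sigma$ we have

(8.9)    $\Big( \sum_{j=0}^{p-1} F^{p-j-1} D_1 F^j \Big) \bar K^{n-1} = K'^{\,n-1} - K^{n-1}$

where $K^{n-1} = f(\bar K^{n-1})$ and $K'^{\,n-1} = g(\bar K^{n-1})$.
Now let $\bar K^{n-1}$ be a q-cycle, then

(8.10)    $D_1 F^q \bar K^{n-1}, \cdots, D_1 F^{p-1} \bar K^{n-1}$

are all zero since either the $F^q \bar K^{n-1}, \cdots, F^{p-1} \bar K^{n-1}$ are zero or they are of "negative
dimension."
Hence

(8.11)    $K'^{\,n-1} - K^{n-1} = F^{p-q} \Big[ \Big( \sum_{j=0}^{q-1} F^{q-j-1} D_1 F^j \Big) \bar K^{n-1} \Big],$

thus $K'^{\,n-1} - K^{n-1}$ is a (p − q)th boundary.

Hence $K'^{\,n-1}$ and $K^{n-1}$ determine the same element of $H_q^{n-1}(\Sigma)$. This completes
the proof of (8.1).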
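
The homotopy formula (8.9) can be tested on small chains. The Python sketch below is an illustration only, not the author's procedure. It assumes that simplexes of the source system are ordered tuples of vertices (repetitions allowed), that simplexes of the target system are sorted tuples of (vertex, multiplicity) pairs, and that $D_1$ has the form obtained from (8.2) with r = 1, namely the sum over i of $(A_1 \cdots A_i A_i'^{\,p-1} A'_{i+1} \cdots A'_n) - (A_1 \cdots A_{i-1} A_i'^{\,p} A'_{i+1} \cdots A'_n)$. All function names are hypothetical, and the target system is taken large enough to contain every simplex that occurs.

```python
from collections import Counter


def clean(counter):
    """Drop zero coefficients."""
    return {s: c for s, c in counter.items() if c}


def as_simplex(vertices):
    """Multiset of target vertices as a sorted tuple of (vertex, multiplicity)."""
    return tuple(sorted(Counter(vertices).items()))


def boundary(chain, p):
    """F on target chains (multiset simplexes)."""
    out = Counter()
    for s, c in chain.items():
        for i, (v, a) in enumerate(s):
            lowered = ((v, a - 1),) if a > 1 else ()
            face = s[:i] + lowered + s[i + 1:]
            out[face] = (out[face] + c * a) % p
    return clean(out)


def boundary_src(chain, p):
    """F on source chains: delete each position in turn, as in (1.5)."""
    out = Counter()
    for s, c in chain.items():
        for i in range(len(s)):
            face = s[:i] + s[i + 1:]
            out[face] = (out[face] + c) % p
    return clean(out)


def image(chain, h, p):
    """Chain induced by the vertex map h (a dict)."""
    out = Counter()
    for s, c in chain.items():
        t = as_simplex([h[v] for v in s])
        out[t] = (out[t] + c) % p
    return clean(out)


def prism(chain, f, g, p):
    """Assumed form of D_1; it vanishes on the empty simplex (negative dimension)."""
    out = Counter()
    for s, c in chain.items():
        for i in range(len(s)):
            head = [f[v] for v in s[:i]]
            tail = [g[v] for v in s[i + 1:]]
            plus = as_simplex(head + [f[s[i]]] + [g[s[i]]] * (p - 1) + tail)
            minus = as_simplex(head + [g[s[i]]] * p + tail)
            out[plus] = (out[plus] + c) % p
            out[minus] = (out[minus] - c) % p
    return clean(out)


def combine(x, y, p, sign=1):
    """x + sign * y mod p."""
    out = Counter(x)
    for s, c in y.items():
        out[s] = (out[s] + sign * c) % p
    return clean(out)


def homotopy_sum(K, f, g, p):
    """Left side of (8.9): sum over j of F^(p-1-j) D_1 F^j K."""
    total, FjK = {}, K
    for j in range(p):
        term = prism(FjK, f, g, p)
        for _ in range(p - 1 - j):
            term = boundary(term, p)
        total = combine(total, term, p)
        FjK = boundary_src(FjK, p)
    return total


if __name__ == "__main__":
    p = 3
    f = {"u": "a", "v": "b", "w": "c"}
    g = {"u": "b", "v": "b", "w": "d"}
    K = {("u", "v"): 1, ("v", "w"): 2}
    right = combine(image(K, g, p), image(K, f, p), p, sign=-1)
    print(homotopy_sum(K, f, g, p) == right)
```

The comparison is made for an arbitrary chain, since (8.9) does not require the chain to be a cycle; the cycle condition enters only in the passage to (8.11).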

Remark: The above operations D, generalize in obvious manner the so-called 
“homotopy-operator” D of (L, Ch. IV, No. 14). 


INSTITUTE FOR ADVANCED STUDY




ANNALS OF MATHEMATICS
Vol. 43, No. 2, April, 1942 


ON THE DIFFERENTIAL EQUATIONS OF THE SIMPLEST 
BOUNDARY-LAYER PROBLEMS 


By HERMANN WEYL 
(Received February 10, 1942) 


1. The central boundary-value problem and its hydrodynamic interpretation 


In the theory of viscous fluids the following non-linear boundary-value problem
for a function $w(z)$ of a real variable $z \ge 0$ involving two constants $k > 0$
and $\lambda \ge 0$ plays an important part:

(A$_\lambda$)   \[ w''' + 2ww'' + 2\lambda(k^2 - w'^2) = 0 \quad \text{for } z \ge 0; \qquad w(0) = w'(0) = 0, \quad w'(z) \to k \ \text{ for } z \to \infty. \]

We consider $\lambda$ as a given constant, but $k$ as a variable parameter. A
mathematically satisfactory proof of its solvability has never been given, although
various numerical devices, including V. Bush's differential analyzer, have been
set at work on it. We shall here give a complete solution of the problem,¹ first
for the two special values $\lambda = 0$ and $\lambda = \tfrac12$ by a process of alternating
approximations, rapidly converging and thus well suited for numerical computations
(§§2, 4, 5), and then approach the general case (§§6, 7) by the method of fixed
points of transformations in a functional space, which is considerably less
amenable to calculation. In between (§3) the first method will be applied to
certain boundary-value problems closely related to (A$_0$).

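There are available two hydrodynamic interpretations of (A$_\lambda$). Consider
first the steady flow of an incompressible viscous fluid of constant density $\rho$ and
kinematic viscosity $\epsilon^2$ filling the half $z > 0$ of an $m$-dimensional Euclidean space
with the Cartesian coordinates $x_1, \dots, x_m$ and the cylindrical coordinates

\[ r = (x_1^2 + \dots + x_{m-1}^2)^{1/2}, \qquad z = x_m. \]

If cylindrical symmetry prevails and hence the radial ($r$) and vertical ($z$)
components $u$, $v$ of velocity as well as the pressure $\rho p$ depend on $r$, $z$ only, then the
following differential equations obtain for $z > 0$:

(1)   \[ \begin{aligned}
\frac{\partial u}{\partial r} + \frac{m-2}{r}\,u + \frac{\partial v}{\partial z} &= 0, \\
u\frac{\partial u}{\partial r} + v\frac{\partial u}{\partial z} + \frac{\partial p}{\partial r} &= \epsilon^2\Big(\Delta u - \frac{m-2}{r^2}\,u\Big), \\
u\frac{\partial v}{\partial r} + v\frac{\partial v}{\partial z} + \frac{\partial p}{\partial z} &= \epsilon^2\,\Delta v,
\end{aligned} \]


¹ See the author's preliminary notes in Proc. Nat. Acad. Sci. 27, 1941, pp. 578-583, and
28, 1942, pp. 100-102.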


where the Laplace operator $\Delta$ is defined by

\[ \Delta\varphi = \frac{\partial^2\varphi}{\partial r^2} + \frac{m-2}{r}\frac{\partial\varphi}{\partial r} + \frac{\partial^2\varphi}{\partial z^2}. \]

These equations are to be combined with the boundary conditions

\[ u \to 0, \quad v \to 0 \quad \text{for } z \to 0. \]

For an ideal fluid, $\epsilon = 0$, we have this simple solution:

\[ u_0(r, z) = \frac{2kr}{m-1}, \qquad v_0(r, z) = -2kz, \qquad
p_0 = \text{const.} - \tfrac12(u_0^2 + v_0^2) = \text{const.} - 2k^2\Big(\frac{r^2}{(m-1)^2} + z^2\Big), \]

arising from the harmonic velocity potential

\[ k\Big(\frac{r^2}{m-1} - z^2\Big) \]

and involving an arbitrary positive constant $k$. As is necessary, the vertical
(though not the radial) component velocity vanishes along the boundary $z = 0$.
The Navier-Stokes equations (1) for the viscous fluid possess a solution of the
form

\[ u = \frac{2r}{m-1}F'(z), \qquad v = -2F(z), \qquad p = \text{const.} - 2k^2\Big(\frac{r^2}{(m-1)^2} + L(z)\Big) \]

which approaches the solution $u_0, v_0, p_0$ for $z \to \infty$. The first equation (1) is
identically satisfied, the second and third yield

\[ \epsilon^2 F''' + 2FF'' + \frac{2}{m-1}(k^2 - F'^2) = 0 \]

and

(2)   \[ k^2 L' = 2FF' + \epsilon^2 F'' \]

respectively. Setting $F(\epsilon z) = \epsilon\cdot w(z)$ we obtain the equations (A$_\lambda$) with
$\lambda = 1/(m-1)$ from which the viscosity constant has disappeared, so that $w$ is
independent of $\epsilon$. Equation (2) in integrated form gives

\[ k^2\cdot L(\epsilon z) = \epsilon^2\{w'(z) + w^2(z)\}. \]

Our solution describes approximately the flow of a viscous fluid around an
obstacle with a blunt nose in the neighborhood of the forward stagnation point.
The cases of physical interest, $m = 2$ and 3, i.e. $\lambda = 1$ and $\tfrac12$, have been treated
by Hiemenz and Homann respectively.²





² Hiemenz, Dingler's Polytech. Jour. 326, 1911, pp. 321-326. F. Homann, Zeitschr.
angew. Math. Mech. 16, 1936, p. 153.




Let the subscript $\epsilon$ in $u_\epsilon, v_\epsilon, p_\epsilon$ indicate dependence of our flow on the viscosity
constant $\epsilon$. Certainly $u_\epsilon, v_\epsilon, p_\epsilon$ tend to $u_0, v_0, p_0$ with $\epsilon \to 0$ in the region
$z > 0$, but the convergence cannot be uniform at the boundary because the
viscous fluid adheres, the ideal glides along the wall. Hence we have the
phenomenon of a boundary layer of thickness $\sim\epsilon$ in which the velocity rises
from 0 at the surface to the external value

\[ \bar u = u_0(r, 0) = \frac{2kr}{m-1}, \qquad \bar v = v_0(r, 0) = 0. \]

Indeed

(3)   \[ u_\epsilon(r, \epsilon z), \qquad \frac{1}{\epsilon}\,v_\epsilon(r, \epsilon z), \qquad p_\epsilon(r, \epsilon z) \]

tend with $\epsilon \to 0$ to the values

\[ U(r, z) = \frac{2r}{m-1}\,w'(z), \qquad V(r, z) = -2w(z), \qquad
\bar p(r) = p_0(r, 0) = \text{const.} - 2k^2\Big(\frac{r}{m-1}\Big)^2. \]

[As a matter of fact, the first two quantities (3) are independent of $\epsilon$, the last
differs from $\bar p(r)$ by the term $2\epsilon^2\{w'(z) + w^2(z)\}$ of order $\epsilon^2$.] According to L.
Prandtl, similar circumstances with regard to convergence for $\epsilon \to 0$ are to be
expected along the surface of any obstacle immersed in a fluid of viscosity $\epsilon^2$.

We propose to formulate the two-dimensional boundary layer problem in
terms of conformal coordinates $\xi_1, \xi_2$ which arise from the Cartesian coordinates
$x_1, x_2$ by a conformal transformation. Let $u_1, u_2$ be the covariant components
of velocity with respect to these coordinates $\xi_1, \xi_2$ and

\[ ds^2 = dx_1^2 + dx_2^2 = e\,(d\xi_1^2 + d\xi_2^2) \]

the square of the line element. The Navier-Stokes equations assume the form


OU OU2 _ 
4 Our 
(4) xa + ’ 

To UE — 4 :> Deu‘ 24 OP 
F 
(5) be at i ou ey WET HA 
Uk Ui e 
h ae "7 1 (3 - 3) ee 
where 
Au = + Fa 
= ag 


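We suppose that $u_1, u_2, p$ with $\epsilon \to 0$ converge to the flow $S_0$ of an ideal fluid
arising from a harmonic potential $\varphi$:

\[ u_i^0 = \partial\varphi/\partial\xi_i, \qquad p_0 = \text{const.} - \tfrac12\sum_i u_0^i u_i^0 = \text{const.} - \frac{1}{2e}\sum_i (u_i^0)^2. \]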













Along with $\varphi$ any multiple $k\varphi$ with a positive constant factor $k$ is equally
serviceable, which means that the total strength of the stream may be arbitrarily fixed.
Choose $\xi_1, \xi_2$ so that a multiple $k\zeta$ of $\zeta = \xi_1 + i\xi_2$ is the complex potential of the
limiting flow $S_0$ and let the stream line $\xi_2 = 0$ be the one which forms the
boundary. We have good reasons to believe, and this belief is the basis of the
boundary-layer theory, that $u_1$, $u_2/\epsilon$ and $p$, when expressed in terms of the
arguments $\xi = \xi_1$, $\eta = \xi_2/\epsilon$, tend to limiting functions $U(\xi, \eta)$, $V(\xi, \eta)$, $P(\xi, \eta)$
which satisfy the equations arising from (4), (5) by the same passage to the
limit. The second equation (5) then shows that $\partial P/\partial\eta = 0$, and $P(\xi, \eta)$ is
therefore independent of $\eta$ and has the value

\[ \bar p(\xi) = p_0(\xi, 0) = \text{const.} - k^2/2\bar e(\xi), \qquad \bar e(\xi) = e(\xi, 0), \]

throughout the boundary layer. Thereafter the two other equations give

(B)   \[ \frac{\partial U}{\partial\xi} + \frac{\partial V}{\partial\eta} = 0, \qquad
U\frac{\partial U}{\partial\xi} + V\frac{\partial U}{\partial\eta} + h(\xi)(k^2 - U^2) = \frac{\partial^2 U}{\partial\eta^2}, \]

where

\[ h(\xi) = \frac12\,\frac{d\log\bar e(\xi)}{d\xi}. \]

One has to add the boundary conditions

(B$_0$)   \[ U \to 0, \quad V \to 0 \quad \text{for } \eta \to 0 \qquad \text{and} \qquad U \to k \quad \text{for } \eta \to \infty. \]

A full justification of the basic hypothesis of boundary-layer theory will hardly
be possible without changing its differential form as given by these equations
into a suitable integral form and without proving the existence of a unique
solution of the problem (B, B$_0$).³

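Because of the first equation (B), the flow $(U, V)$ derives from a stream
function $\psi$,

\[ U = \partial\psi/\partial\eta, \qquad V = -\partial\psi/\partial\xi, \]

satisfying the formidable differential equation

(B$_\psi$)   \[ h(\xi)\Big\{k^2 - \Big(\frac{\partial\psi}{\partial\eta}\Big)^2\Big\}
+ \frac{\partial\psi}{\partial\eta}\frac{\partial^2\psi}{\partial\xi\,\partial\eta}
- \frac{\partial\psi}{\partial\xi}\frac{\partial^2\psi}{\partial\eta^2}
= \frac{\partial^3\psi}{\partial\eta^3} \]

and the boundary conditions

(B$_\psi^0$)   \[ \psi \to 0, \quad \frac{\partial\psi}{\partial\eta} \to 0 \quad \text{for } \eta \to 0, \qquad \frac{\partial\psi}{\partial\eta} \to k \quad \text{for } \eta \to \infty. \]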





³ Experience shows that in general the assumptions of the theory are fulfilled only along
a certain frontal part of the surface of the solid. For the whole theory see S. Goldstein,
Modern Developments in Fluid Dynamics, vol. I, Oxford, 1938.




Suppose now the obstacle is an angle of $\pi\lambda$ ($0 \le \lambda < 2$) with the origin as
vertex and the positive real axis as median. The exterior of the angle is mapped
conformally upon the slit $(\xi_1 + i\xi_2)$-plane, the slit extending along the positive
real axis, by the analytic function

\[ x_1 + ix_2 = \text{const.}\,(\xi_1 + i\xi_2)^{1 - \lambda/2}, \]

and thus one readily finds

(6)   \[ h(\xi) = -\frac{\lambda}{2\xi}. \]

The domain for the differential equation (B$_\psi$) is the quadrant $\xi > 0$, $\eta > 0$.
If, more generally, the solid parts the stream symmetrically with a prow of
angle $\pi\lambda$ at the origin, then the formula (6) will hold at least approximately in
the neighborhood of the forward stagnation point. The problem (B$_\psi$, B$_\psi^0$)
with this value of $h(\xi)$ is carried into itself by the transformation

(7)   \[ \xi \to \gamma^2\xi, \qquad \eta \to \gamma\eta, \qquad \psi \to \gamma\psi \]

($\gamma$ a positive constant). Hence the solution must be of the form

(8)   \[ \psi(\xi, \eta) = 2\sqrt{\xi}\cdot w(\eta/2\sqrt{\xi}), \]

and for the function $w(z)$ one obtains exactly the conditions (A$_\lambda$). The case
$\lambda = 0$ where the obstacle consists of the half line $y = 0$, $x \ge 0$ in the $x, y$-plane
and the fluid flows by with constant positive velocity $k$ was the first boundary-layer
problem to be numerically integrated (Blasius 1907). Arbitrary values
of $\lambda$ have been treated by V. M. Falkner, S. W. Skan and D. R. Hartree.⁴ Of
the two hydrodynamic interpretations for (A$_\lambda$) which we have described, the
second is applicable to all values of $\lambda$ (at least within the range $0 \le \lambda < 2$), the
first to the reciprocal integers $\lambda = 1/(m-1)$ only. Both coincide for $\lambda = 1$.
We notice in particular that the two-dimensional boundary-layer problem of
the rectangular prow is mathematically equivalent to the three-dimensional
flow against a straight wall ($\lambda = \tfrac12$).
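
The reduction from the similarity form (8) to (A$_\lambda$) can be checked by direct substitution. The following is only a sketch: it assumes the stream-function equation reads $h(k^2 - \psi_\eta^2) + \psi_\eta\psi_{\xi\eta} - \psi_\xi\psi_{\eta\eta} = \psi_{\eta\eta\eta}$ with $h(\xi) = -\lambda/2\xi$, and it writes $z = \eta/2\sqrt{\xi}$; the subscript notation for partial derivatives is introduced here for brevity.

```latex
% With \psi = 2\sqrt{\xi}\, w(z), z = \eta / (2\sqrt{\xi}), so that \partial z/\partial\xi = -z/(2\xi):
\begin{aligned}
\psi_\eta &= w'(z), &
\psi_{\eta\eta} &= \frac{w''(z)}{2\sqrt{\xi}}, &
\psi_{\eta\eta\eta} &= \frac{w'''(z)}{4\xi}, \\
\psi_\xi &= \frac{w - z\,w'}{\sqrt{\xi}}, &
\psi_{\xi\eta} &= -\frac{z\,w''}{2\xi}. &&
\end{aligned}
% Substituting, the terms containing z w' w'' cancel:
-\frac{\lambda}{2\xi}\,(k^2 - w'^2) - \frac{w\,w''}{2\xi} = \frac{w'''}{4\xi}
\quad\Longleftrightarrow\quad
w''' + 2 w w'' + 2\lambda\,(k^2 - w'^2) = 0 .
% The boundary conditions \psi = \psi_\eta = 0 at \eta = 0 and \psi_\eta \to k for \eta \to \infty
% become w(0) = w'(0) = 0 and w'(z) \to k for z \to \infty.
```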


2. Solution of Blasius’s Problem 


Turning to the solution of our problems, we start with the case $\lambda = 0$, which
occupies a singular position inasmuch as the parameter $k$ is absent from its
differential equation

(9)   \[ w''' + 2ww'' = 0. \]





⁴ H. Blasius, Zeitschr. Math. Phys. 56, 1908, p. 1. V. M. Falkner and S. W. Skan, Phil.
Mag. 12, 1931, p. 865; (British) Aero. Res. Comm. R. & M. 1314; D. R. Hartree, Proc.
Camb. Phil. Soc. 33, 1937, pp. 223-239.










For any constant $\kappa$ the expression $\kappa\cdot w(\kappa z)$ is a solution of this equation if $w(z)$ is.
Following an argument first advanced by Töpfer⁵ let $w = f(z)$ be the solution
determined by the initial values

\[ f(0) = f'(0) = 0, \qquad f''(0) = 1. \]

Once we are sure that $f$ extends over the whole interval $0 \le z < \infty$ and $f'$ tends
to a positive limit $\beta$ with $z \to \infty$,

\[ \beta = \int_0^\infty f''(z)\,dz > 0, \]

we may adjust the constant $\kappa$ so as to let the derivative of $w = \kappa\cdot f(\kappa z)$ approach
$k$ at infinity:

(10)   \[ \kappa^2\beta = k, \qquad \kappa = (k/\beta)^{1/2}. \]

Therefore

(11)   \[ w''(0) = \kappa^3 = a\,k^{3/2}, \qquad a = \beta^{-3/2}. \]

The value $w''(0)$ is the essential factor in the formula for the skin friction along
the immersed plate. Hence skin friction is proportional to the 3/2 power of
velocity.

Treat $f$ and $f''$ in the equation

\[ f''' + 2f\cdot f'' = 0 \]

as two separate functions. Because of the initial condition $f''(0) = 1$ one then
obtains

\[ f''(z) = \exp\Big(-2\int_0^z f(\zeta)\,d\zeta\Big). \]

Introduce $f'' = g$ as the unknown function and, using the initial values $f(0) =
f'(0) = 0$, tie up $f$, or rather its integral, with $g$ by two successive partial
integrations:

\[ 2\int_0^z f(\zeta)\,d\zeta = \int_0^z (z - \zeta)^2 f''(\zeta)\,d\zeta. \]

The differential equation plus the initial conditions are thus equivalent to the
integral equation

(12)   \[ g = \Phi\{g\} \]

with the operator

\[ \Phi\{g\} = \exp\Big(-\int_0^z (z - \zeta)^2 g(\zeta)\,d\zeta\Big). \]
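
The two partial integrations that link the integral of $f$ to $g = f''$ may be written out as follows; this is merely a spelled-out version of the step, using nothing beyond $f(0) = f'(0) = 0$.

```latex
\begin{aligned}
\int_0^z (z-\zeta)^2 f''(\zeta)\,d\zeta
  &= \Big[(z-\zeta)^2 f'(\zeta)\Big]_0^z + 2\int_0^z (z-\zeta)\, f'(\zeta)\,d\zeta
   && \text{(boundary term vanishes since } f'(0)=0\text{)} \\
  &= 2\Big[(z-\zeta)\, f(\zeta)\Big]_0^z + 2\int_0^z f(\zeta)\,d\zeta
   && \text{(boundary term vanishes since } f(0)=0\text{)} \\
  &= 2\int_0^z f(\zeta)\,d\zeta .
\end{aligned}
```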





5 Zeitschr. Math. Phys. 60, 1912, pp. 397-398. 










Notice the following properties of this operator:

\[ \text{(i)}\ \ \Phi\{g\} \ge 0, \qquad \text{(ii)}\ \ \Phi\{g\} \ge \Phi\{g^*\} \ \text{ if } g \le g^*. \]

We are led to define a sequence of successive "approximations" $g_n$, starting
with $g_0(z) = 0$, by

\[ g_{n+1} = \Phi\{g_n\} \qquad (n = 0, 1, 2, \dots). \]

The trivial relations $g_1 \ge g_0 = 0$, $g_2 \ge g_0 = 0$, implied in (i), give rise by (ii)
to two rows of inequalities, namely

(13)   \[ g_1 \ge g_0, \quad g_2 \le g_1, \quad g_3 \ge g_2, \quad g_4 \le g_3, \ \dots \]

and

(14)   \[ g_2 \ge g_0, \quad g_3 \le g_1, \quad g_4 \ge g_2, \quad g_5 \le g_3, \ \dots \]

The latter may be rearranged as follows:

\[ 0 = g_0 \le g_2 \le g_4 \le \cdots \qquad \text{and} \qquad g_1 \ge g_3 \ge g_5 \ge \cdots. \]

In view of (13) these relations prove that the descending sequence of the odd $g_n$
lies above the ascending sequence of the even $g_n$. Does this "alternating
pincer movement" close in on a uniquely determined limit function $g(z)$?
To answer this question, introduce the abbreviation

(15)   \[ G(z) = \int_0^z (z - \zeta)^2 g(\zeta)\,d\zeta \]

and let

\[ 0 \le g(z) \le g^*(z), \qquad \Delta g = g^* - g, \qquad \Delta\Phi\{g\} = \Phi\{g^*\} - \Phi\{g\}. \]

Since

\[ 0 \le e^{-u} - e^{-v} \le v - u \quad \text{if } 0 \le u \le v \]

we get

\[ 0 \le -\Delta\Phi\{g\} \le \Delta G. \]

The increment $\Delta G$ arises from $2\cdot\Delta g$ by thrice integrating from 0 to $z$. These
remarks suffice to establish the inequality

(16)   \[ |g_{n+1}(z) - g_n(z)| \le (2z^3)^n/(3n)!. \]

Indeed, because $g_0 = 0$, $g_1 = 1$, it holds for $n = 0$, and since threefold integration
changes

\[ z^{3n}/(3n)! \quad \text{into} \quad z^{3n+3}/(3n+3)! \]

the inequality carries over from $n$ to $n + 1$, i.e. from $g_{n+1} - g_n$ to $\Phi\{g_{n+1}\} - \Phi\{g_n\}$.
Thus convergence (of the type of the exponential series) is assured by the
relation (16), and we obtain a solution

\[ g(z) = \lim_{n\to\infty} g_n(z) \]

of (12) which is larger than the even and smaller than the odd $g_n$.
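
The alternating approximations lend themselves to a small numerical experiment. The sketch below is not the author's computation: it discretizes the operator $\Phi\{g\} = \exp(-\int_0^z (z-\zeta)^2 g(\zeta)\,d\zeta)$ by the trapezoidal rule on a uniform grid, truncates the half line at a hypothetical `z_max`, and iterates from $g_0 = 0$. The names `cumulative_trapezoid`, `apply_phi`, `blasius_iterates` and the grid parameters are assumptions made for illustration.

```python
import math


def cumulative_trapezoid(values, h):
    """Running trapezoidal integral of samples `values` with uniform spacing h."""
    total, out = 0.0, [0.0]
    for left, right in zip(values, values[1:]):
        total += 0.5 * h * (left + right)
        out.append(total)
    return out


def apply_phi(g, z, h):
    """Discretized Phi{g}(z) = exp(-G(z)), G(z) = int_0^z (z - s)^2 g(s) ds."""
    m0 = cumulative_trapezoid(g, h)
    m1 = cumulative_trapezoid([s * v for s, v in zip(z, g)], h)
    m2 = cumulative_trapezoid([s * s * v for s, v in zip(z, g)], h)
    # (z - s)^2 = z^2 - 2 z s + s^2, hence G = z^2 m0 - 2 z m1 + m2.
    return [math.exp(-(x * x * a - 2.0 * x * b + c))
            for x, a, b, c in zip(z, m0, m1, m2)]


def blasius_iterates(z_max=6.0, n=3000, steps=30):
    """Return grid, spacing and the iterates g_0, g_1, ..., g_steps."""
    h = z_max / n
    z = [i * h for i in range(n + 1)]
    iterates = [[0.0] * (n + 1)]                 # g_0 = 0
    for _ in range(steps):
        iterates.append(apply_phi(iterates[-1], z, h))
    return z, h, iterates


z, h, gs = blasius_iterates()
beta = cumulative_trapezoid(gs[-1], h)[-1]       # beta = int_0^inf g dz (truncated)
a = beta ** -1.5                                 # coefficient a = beta^(-3/2)
print(f"beta ~ {beta:.4f}, a ~ {a:.4f}")
```

On this grid the iterates reproduce the pincer pattern (even iterates below, odd iterates above), and the resulting value of `a` can be compared with the figure 0.664 quoted from the literature in this section.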



















Uniqueness is established by the remark that any solution $g(z)$ of (12) satisfies
the inequalities

\[ g \ge g_0, \quad g \le g_1, \quad g \ge g_2, \quad g \le g_3, \ \dots \]

derived from the trivial one $g \ge g_0 = 0$ by iterated application of the operator $\Phi$.
Thus $g$ is necessarily caught between the tongs of the even and the odd $g_n$.

Our next concern is the asymptotic behavior of $g(z)$ for $z \to \infty$. Choose any
$z_0 > 0$ and set

\[ \int_0^{z_0} g_2(\zeta)\,d\zeta = c \ (> 0). \]

As $G_2$ arises by two-fold integration from $2\int_0^z g_2(\zeta)\,d\zeta$ we get $G_2(z) \ge c(z - z_0)^2$
and thus

\[ g(z) \le g_3(z) \le e^{-c(z - z_0)^2} \quad \text{for } z \ge z_0. \]

Consequently

\[ \int_0^\infty g(z)\,dz = \beta > 0, \qquad \int_0^\infty z\,g(z)\,dz = \beta' > 0 \]

converge, the asymptotic behavior of

\[ f(z) = \int_0^z (z - \zeta)\,g(\zeta)\,d\zeta \]

is indicated by

(17)   \[ f(z) \sim \int_0^\infty (z - \zeta)\,g(\zeta)\,d\zeta = \beta z - \beta' \]

and that of $w(z)$ by

(18)   \[ w(z) \sim kz - \kappa\beta'. \]

As $g(z) \ge g_2(z) = e^{-z^3/3}$ implies

\[ \beta \ge B = \int_0^\infty e^{-z^3/3}\,dz = 3^{-2/3}\,\Gamma(\tfrac13) \]

we find the numerical coefficient $a$ in (11) to be < 0.684. According to the most
reliable computations⁶ $a = 0.664$. Hence the very first approximate value for
$a$ which can be derived from our method misses the mark by not more than
3 per cent.
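
The first bound for the coefficient $a$ is easy to reproduce. The short sketch below evaluates $B = 3^{-2/3}\Gamma(\tfrac13)$ with the standard library, cross-checks it against a plain midpoint quadrature of $\int_0^\infty e^{-z^3/3}\,dz$ (truncated at a hypothetical upper limit of 6), and forms $B^{-3/2}$; the variable names are chosen here for illustration only.

```python
import math

# Closed form: B = int_0^inf exp(-z^3/3) dz = 3^(-2/3) * Gamma(1/3).
B = 3 ** (-2 / 3) * math.gamma(1 / 3)

# Independent check by the midpoint rule on [0, 6]; the tail beyond 6 is negligible.
n, z_max = 6000, 6.0
h = z_max / n
B_quadrature = h * sum(math.exp(-((i + 0.5) * h) ** 3 / 3) for i in range(n))

# Since beta >= B and a = beta^(-3/2), the coefficient a is at most B^(-3/2).
a_upper = B ** -1.5

print(f"B = {B:.4f} (quadrature {B_quadrature:.4f}), a <= {a_upper:.4f}")
```

Comparing `a_upper` with the quoted value 0.664 gives the relative overshoot of about 3 per cent mentioned in the text.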

Given any positive constant $c < B$ we have seen that, for sufficiently large $z$,

\[ (G_n(z) \ge)\ G_2(z) \ge cz^2. \]





⁶ See Töpfer, l.c.⁵, and S. Goldstein, Proc. Camb. Phil. Soc. 26, 1930, pp. 19-20.





By making use of this fact, one can sharpen the upper bound in (16) to

\[ e^{-cz^2}\cdot(2z^3)^n/(3n)!. \]

The maximum value of this function is assumed for $z = (3n/2c)^{1/2}$, and we thus
ascertain a constant upper bound $\mu_n$ for $|g_{n+1}(z) - g_n(z)|$ in the entire interval
$z \ge 0$ which is essentially of the order


eae 


(un upm~ 18. 


Fee 

Such sharper estimates are valuable guides for numerical computation.

Knowing a priori that the positive functions $f''$, $f'$, $f$ have the upper bounds 1,
$z$, $\tfrac12 z^2$ respectively, one could have established the existence (and uniqueness) of
the solution $f$ over the entire interval $0 \le z < \infty$ within the frame of the classic
theory of differential equations. But as those bounds (and other related
estimates) are most easily derived from the integral equation (12), I have
preferred to carry the construction through on its basis. For the general case (A$_\lambda$),
$\lambda \ne 0$, I see no other alternative.

J. von Neumann pointed out to me that the differential equation (9) of order 3
must be reducible to one of first order (followed by two quadratures) because it
permits the group of transformations

\[ z \to z + z_0, \qquad w(z) \to \kappa\cdot w(\kappa z) \]

involving two arbitrary constants $z_0$ and $\kappa$, and that thus the problem comes
within reach of Poincaré's discussion of first-order differential equations.
Setting 


- dw ~2 dd 
= . ——— Sa Pie — om 23 = t 
w=e, aati 3(s), . + 
von Neumann obtains the equation 


dt _tut+0d + 2) 
dd =—S- 8 (28 — 2) 


with the initial condition t > © for? — «©. After determining ¢(#) from this 
equation, one finds by quadratures s and then z as functions of # from 


90 — t? 8 3(28 — ft)" 


3. Generalization. Power series. Goldstein's wake problem.

In a trivial manner our method carries over to the equation

(19)   \[ f^{(\nu+1)} + 2f f^{(\nu)} = 0 \qquad (z \ge 0) \]

with the initial conditions

(20)   \[ f = f' = \dots = f^{(\nu-1)} = 0, \quad f^{(\nu)} = 1 \quad \text{for } z = 0. \]

Here $\nu$ may be any positive integer. Setting $f^{(\nu)} = g$ we get the integral equation

\[ g(z) = \exp\Big(-\frac{2}{\nu!}\int_0^z (z - \zeta)^{\nu} g(\zeta)\,d\zeta\Big), \]

and after defining the alternating sequence $g_n(z)$ accordingly, we find instead
of (16)

\[ |g_{n+1}(z) - g_n(z)| \le 2^n z^{n^*}/n^*! \qquad [n^* = (\nu + 1)n]. \]

We may even generalize the initial conditions (20) to

(21)   \[ f(0) = c_0, \ \dots, \ f^{(\nu-1)}(0) = c_{\nu-1}, \quad f^{(\nu)}(0) = 1 \]

with arbitrary constants $c_i$. Then our integral equation reads

\[ g(z) = \exp\Big(-2Q(z) - \frac{2}{\nu!}\int_0^z (z - \zeta)^{\nu} g(\zeta)\,d\zeta\Big), \]

where $Q(z)$ is the polynomial

(22)   \[ Q(z) = \frac{c_0}{1!}\,z + \frac{c_1}{2!}\,z^2 + \dots + \frac{c_{\nu-1}}{\nu!}\,z^{\nu}, \]

and convergence follows from the inequality

\[ |g_{n+1}(z) - g_n(z)| \le A^{n+1}\cdot 2^n z^{n^*}/n^*! \quad [\le A(2Aa^{\nu+1})^n/n^*!] \]

holding in any interval $0 \le z \le a$ in which $e^{-2Q(z)} \le A$. The solution $g(z)$
satisfies the inequality

(23)   \[ 0 \le g(z) \le g_1(z) = e^{-2Q(z)}. \]

Let us for a moment return to the simple initial conditions (20). From the
lowest case $\nu = 1$ where the solution is an elementary function, namely $f(z) =
\tanh(z)$, we learn that we must not expect the Taylor expansion of the solution
$f(z)$ around the origin to converge beyond a certain finite limit, which for $\nu = 1$
is reached at the point $z = \pi/2$. I find no indication in the literature that this
had been realized in Blasius's case $\nu = 2$. For any $\nu$ the coefficients $c_n$ of the
power series

\[ f(z) = \sum_{n=0}^{\infty} (-1)^n c_n z^{n^*+\nu} \qquad [n^* = (\nu + 1)n] \]

are determined by the recursive equations $c_0 = 1/\nu!$,

\[ n^*(n^* + 1)\cdots(n^* + \nu)\,c_n = 2\sum (i^* + 1)\cdots(i^* + \nu)\,c_i c_k \qquad (i + k = n - 1). \]
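
The recursion for the Taylor coefficients can be run exactly with rational arithmetic. The sketch below reads the recursion as $n^*(n^*+1)\cdots(n^*+\nu)\,c_n = 2\sum_{i+k=n-1}(i^*+1)\cdots(i^*+\nu)\,c_i c_k$ with $c_0 = 1/\nu!$ and $n^* = (\nu+1)n$, which is an interpretation of the printed formula; the function names `rising_product` and `series_coefficients` are introduced here. For $\nu = 1$ the output can be compared with the known expansion of $\tanh z$, and the ratio of successive coefficients estimates $R^{\nu+1}$ for the radius of convergence $R$.

```python
from fractions import Fraction
from math import factorial, pi


def rising_product(start, count):
    """start * (start + 1) * ... * (start + count - 1)."""
    product = 1
    for j in range(count):
        product *= start + j
    return product


def series_coefficients(nu, n_terms):
    """Coefficients c_n of f(z) = sum (-1)^n c_n z^((nu+1) n + nu), as exact fractions."""
    step = nu + 1
    c = [Fraction(1, factorial(nu))]
    for n in range(1, n_terms):
        # Right-hand sum over i + k = n - 1 of (i*+1)...(i*+nu) c_i c_k.
        rhs = sum(rising_product(step * i + 1, nu) * c[i] * c[n - 1 - i]
                  for i in range(n))
        # Left-hand factor n*(n*+1)...(n*+nu), a product of nu + 1 integers.
        c.append(2 * rhs / rising_product(step * n, nu + 1))
    return c


tanh_coeffs = series_coefficients(1, 25)
print([str(x) for x in tanh_coeffs[:4]])

# Ratio test: c_n / c_(n+1) tends to R^(nu+1); for nu = 1 the limit is (pi/2)^2.
ratio_nu1 = float(tanh_coeffs[23] / tanh_coeffs[24])
print(ratio_nu1, pi ** 2 / 4)

blasius_coeffs = series_coefficients(2, 25)
radius_nu2 = float(blasius_coeffs[23] / blasius_coeffs[24]) ** (1 / 3)
print(f"estimated radius of convergence for nu = 2: {radius_nu2:.3f}")
```

The last line is only a ratio-test estimate for Blasius's case, not a proved value.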


Following the same straightforward procedure as in my first note in the Pro- 
ceedings, we obtain 


1f{ Qt \" 1 2 . 
ee ee 










and thus for the radius R of convergence the bounds 
My + 1)-(v +1)! S R™ S ¥(v + 1) ow (29 +1). 


The essential difference between the flow of a viscous fluid before and behind
an obstacle is clearly exhibited in our problem (A$_0$) by the fact that no solution
$w$ exists if $w'$ is required to assume a negative value $k$ for $z \to \infty$.

However, S. Goldstein⁷ has treated the wake behind a flat plate under the
plausible hypothesis that the flow up to the abscissa $x = l$ is but little modified
if the plate, $y = 0$, $x \ge 0$, ends at this point. We shift the origin of the
coordinates to the end of the plate and at the same time enlarge the standard length
at the ratio $l^{1/4}:1$, i.e. in our stream function

\[ \psi(x, y) = 2\sqrt{x}\cdot w(y/2\sqrt{x}) \]

we make the substitution

\[ x = l + l^{1/4}\xi, \qquad y = l^{1/4}\eta. \]

We then obtain for $\xi = 0$:

\[ \psi = \varphi(\eta) + \cdots, \qquad \varphi(\eta) = \tfrac14\,a\,k^{3/2}\,\eta^2. \]

The remainder indicated by the dots tends to zero with $l \to \infty$ and shall be
neglected as is permissible for plates of great length. The stream function
$\psi(\xi, \eta)$ of the "wake layer" behind the plate, $\xi > 0$, satisfies the same differential
equation as before

\[ \frac{\partial^3\psi}{\partial\eta^3} + \frac{\partial\psi}{\partial\xi}\frac{\partial^2\psi}{\partial\eta^2} - \frac{\partial\psi}{\partial\eta}\frac{\partial^2\psi}{\partial\xi\,\partial\eta} = 0 \]

while symmetry requires $\psi(\xi, -\eta) = -\psi(\xi, \eta)$. Hence under limitation to the
half plane $\eta \ge 0$ the conditions at the fictitious boundary $\eta = 0$ become $\psi =
\partial^2\psi/\partial\eta^2 = 0$. We wish to construct that solution which for fixed $\eta$ and $\xi \to 0$
(or for fixed $\xi$ and $\eta \to \infty$) ties up with our function $\varphi(\eta)$. The problem,
including this boundary condition, permits the substitution

\[ \xi \to \gamma^3\xi, \qquad \eta \to \gamma\eta, \qquad \psi \to \gamma^2\psi \]

with an arbitrary constant $\gamma$ and must thus be of the form

\[ \psi(\xi, \eta) = 3\xi^{2/3}\cdot w(\eta/\xi^{1/3}). \]

For $w(z)$ one readily obtains the differential equation

(24)   \[ w''' + 2ww'' - w'^2 = 0. \]

The boundary conditions are: $w = w'' = 0$ at $z = 0$, and

(25)   \[ w''(z) \to \tfrac16\,a\,k^{3/2} \quad \text{for } z \to \infty. \]


⁷ Proc. Camb. Phil. Soc. 26, 1930, pp. 18-30.







If $w(z)$ is a solution of (24), so is the function $\kappa\cdot w(\kappa z)$ involving an arbitrary
constant $\kappa$. Let $w = f(z)$ be the solution of (24) with the initial values $f = 0$,
$f' = 1$, $f'' = 0$ for $z = 0$. Then $f'''(0) = 1$ while differentiation changes (24)
into

(26)   \[ w'''' + 2ww''' = 0. \]

Thus we find ourselves confronted with the case $\nu = 3$; $c_0 = 0$, $c_1 = 1$, $c_2 = 0$
of the general problem (19) + (21) discussed above, and since (23) now reads

\[ 0 < f'''(z) = g(z) \le e^{-z^2}, \]

$f''(z)$ tends with $z \to \infty$ to a positive limit

\[ \beta^* = \int_0^\infty g(z)\,dz < \tfrac12\sqrt{\pi}. \]

Consequently we may adjust the constant $\kappa^*$ in $w(z) = \kappa^*\cdot f(\kappa^* z)$ so as to give
$w''(\infty)$ the desired value (25).⁷ᵃ
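
The passage from (24) to (26) and the value $f'''(0) = 1$ follow from a one-line computation, spelled out here as a sketch for the reader's convenience.

```latex
% Differentiating (24), w''' + 2 w w'' - w'^2 = 0, once with respect to z:
\frac{d}{dz}\big(w''' + 2ww'' - w'^2\big)
  = w'''' + 2w'w'' + 2ww''' - 2w'w''
  = w'''' + 2ww''' = 0 .
% Evaluating (24) at z = 0 with f(0) = 0, f'(0) = 1, f''(0) = 0:
f'''(0) = f'(0)^2 - 2 f(0) f''(0) = 1 .
% Scaling: for w(z) = \kappa^* f(\kappa^* z) one has w''(z) = \kappa^{*3} f''(\kappa^* z),
% hence w''(\infty) = \kappa^{*3} \beta^*, which fixes \kappa^* once the limit (25) is prescribed.
```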


4. Solution of Homann’s problem 


Of a more difficult type is the problem (A$_\lambda$) for $\lambda \ne 0$, as it involves the
parameter $k$ in the boundary conditions as well as in the differential equation
itself. Here we are dealing with a real boundary-value problem, which is not
reducible, as (A$_0$) is, to an initial-value problem. Only convergence of the type
of a geometric series, if any, can be expected for the process of successive
approximations. By differentiating the differential equation we eliminate the
parameter $k$:

(27)   \[ w'''' + 2ww''' + 2(1 - 2\lambda)\,w'w'' = 0. \]

This equation is again invariant under the transformation $w(z) \to \kappa\cdot w(\kappa z)$. For
$\lambda = \tfrac12$, the case with which we shall be concerned in the next two sections, we
fall back upon the familiar type (26), although the boundary conditions make
our problem considerably more intricate than before. Let $f$ denote that
solution of (26) for which

\[ f(0) = f'(0) = 0, \qquad f''(0) = 1, \qquad f'''(0) = -\beta^2. \]

It will satisfy the third-order equation

\[ f''' + 2ff'' + (\beta^2 - f'^2) = 0, \]

and we expect that, for a certain positive $\beta$, the derivative $f''$ (and $f'''$) will
strongly approach 0 with $z \to \infty$. Thus the equation itself forces $f'$ (which is
positive throughout the interval) to approach $\beta$, and $w = \kappa\cdot f(\kappa z)$ will solve our
problem if $\kappa$ is determined by (10).





⁷ᵃ A remark by K. Friedrichs to the effect that the assumptions $\nu = 3$, $2Q(z) = -z^2$
lead to a wake with back flow caused me to drop the restriction $Q(z) \ge 0$ for $z \ge 0$ which
the original MS contained. April 4, 1942.










As before we obtain first

\[ f'''(z) = -\beta^2\cdot\exp\Big(-2\int_0^z f(\zeta)\,d\zeta\Big)
= -\beta^2\cdot\exp\Big(-\int_0^z (z - \zeta)^2 f''(\zeta)\,d\zeta\Big) = -\beta^2 e^{-G(z)} \]

and then for $f'' = g$ the equation

\[ g(z) = 1 - \beta^2\int_0^z e^{-G(\zeta)}\,d\zeta. \]

The constant $\beta^2$ is determined by the condition $g(\infty) = 0$, thus

\[ \beta^2 = 1\Big/\int_0^\infty e^{-G(\zeta)}\,d\zeta. \]

Adhering to the notation (15) we are led to introduce an operator $\Psi$ which
produces from any given $g(z)$ the function

(28)   \[ \Psi\{g\} = \int_z^\infty e^{-G(\zeta)}\,d\zeta \Big/ \int_0^\infty e^{-G(\zeta)}\,d\zeta \]

(provided the integral $\int_0^\infty$ converges). Evidently

\[ 0 < \Psi\{g\} \le 1, \]

and as we shall presently prove,

(29)   \[ \Psi\{g\} \ge \Psi\{g^*\} \quad \text{if } g \le g^*. \]

The operator $\Psi$ is applicable to the function $g(z) = 1$ but not to $g(z) = 0$.⁸

In order to solve the functional equation

(30)   \[ g = \Psi\{g\} \]

we therefore construct a sequence of functions $g_n(z)$ by the recursive equation

\[ g_{n+1} = \Psi\{g_n\} \qquad (n = 1, 2, \dots) \]

starting with $g_1(z) = 1$, in the hope that the sequence will converge to a solution
$g$ of (30). Alternating pincer movement of the $g_n$ is a consequence of (29) and
the trivial inequalities $g_2 \le g_1$, $g_3 \le g_1$.

To prove (29), set as before $g^* - g = \Delta g$. The third derivative of $\Delta G =
G^* - G$ is $2\cdot\Delta g$, and $\Delta G$ and its first two derivatives vanish for $z = 0$. Hence


⁸ Application of the operator $\Psi$ becomes unrestricted if one replaces the definition (28) by

\[ \Psi\{g\} = \lim_{a\to\infty}\Big(\int_z^a \Big/ \int_0^a\Big). \]

Then one may start with $g_0(z) = 0$ and find $g_1(z) = 1$. Cf. §6.










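$\Delta g \ge 0$ implies $(\Delta G)''$ to be an increasing function of $z$, and as it vanishes for
$z = 0$ it must be positive throughout. Repeating this argument two more
times, we find that $\Delta G(z)$ is an increasing positive function for $z > 0$. Set

\[ \int_0^z e^{-G(\zeta)}\,d\zeta = H_1, \qquad \int_z^\infty e^{-G(\zeta)}\,d\zeta = H_2, \]

so that

\[ \Psi\{g\} = H_2/(H_1 + H_2). \]

We then have

\[ H_1^* = \int_0^z e^{-G^*(\zeta)}\,d\zeta = \int_0^z e^{-G(\zeta)} e^{-\Delta G(\zeta)}\,d\zeta \ge e^{-\Delta G(z)} H_1, \]

\[ H_2^* = \int_z^\infty e^{-G^*(\zeta)}\,d\zeta = \int_z^\infty e^{-G(\zeta)} e^{-\Delta G(\zeta)}\,d\zeta \le e^{-\Delta G(z)} H_2, \]

or the ratios $\vartheta_1 = H_1^*/H_1$, $\vartheta_2 = H_2^*/H_2$ satisfy the inequalities ($\vartheta_1 \le 1$, $\vartheta_2 \le 1$),
$\vartheta_2 \le \vartheta_1$. Consequently

\[ H_2^*/(H_1^* + H_2^*) \le H_2/(H_1 + H_2) \quad \text{or} \quad \Psi\{g^*\} \le \Psi\{g\}. \]

Again choose a $z_0 > 0$ and set

\[ \int_0^{z_0} g_2(\zeta)\,d\zeta = c > 0 \]

so that $G_2(z) \ge c(z - z_0)^2$ and

\[ g_3(z) \le \text{const.}\int_z^\infty e^{-c(\zeta - z_0)^2}\,d\zeta \quad \text{for } z \ge z_0. \]

All following $g$'s are smaller than $g_3$ and hence the same inequality prevails for
$g_n(z)$ ($n \ge 3$) and, provided the limit $\lim_{n\to\infty} g_n(z) = g(z)$ exists, also for $g(z)$.
Thus knowing that $g(z) = f''(z)$ converges strongly enough to 0 with $z \to \infty$
we again get the asymptotic formula (17) and (11), (18) for the solution

\[ w(z) = \kappa\cdot f(\kappa z), \qquad \kappa = (k/\beta)^{1/2} \]

of our problem (A$_{1/2}$).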
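In proving convergence of the alternating sequence $g_n(z)$ we use the above
notations $g(z) \le g^*(z)$, $\Delta g$, $G$ etc., and write $H = H_1 + H_2$. Then, since $H^* \le H$,

\[ -\Delta\Psi = \Psi\{g\} - \Psi\{g^*\} = \frac{H_2}{H} - \frac{H_2^*}{H^*} \le \frac{H_2 - H_2^*}{H}, \]

\[ -\Delta\Psi \le \int_0^\infty e^{-G(\zeta)}\big(1 - e^{-\Delta G(\zeta)}\big)\,d\zeta \Big/ \int_0^\infty e^{-G(\zeta)}\,d\zeta. \]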
0 










Suppose we have a constant $\mu$ such that

\[ 0 \le \Delta g(z) \le \mu. \]

Then

\[ 1 - e^{-\Delta G(z)} \le \Delta G(z) \le \tfrac13\mu z^3 \]

and hence

\[ 0 \le -\Delta\Psi \le q\mu \]

where

\[ q = \int_0^\infty e^{-G(z)}\cdot\tfrac13 z^3\,dz \Big/ \int_0^\infty e^{-G(z)}\,dz. \]

Another expression for the quotient $q$ is

\[ \int_0^\infty \Psi\{g\}\cdot z^2\,dz \]

as one verifies by substituting (28) for $\Psi\{g\}$.

In this argument we can choose $g = g_n$ and $g^* = g_{n+1}$ or $g_{n-1}$ for any even
$n \ge 2$ and then we obtain majorizing constants $\mu_n$ for all odd and even $n \ge 1$,

(31)   \[ |g_{n+1}(z) - g_n(z)| \le \mu_n, \]

which are defined by the recursive equations

(32)   \[ \mu_1 = 1; \qquad \mu_n = q_n\cdot\mu_{n-1}, \quad \mu_{n+1} = q_n\cdot\mu_n \quad (n \text{ even}) \]

with

\[ q_n = \int_0^\infty e^{-G_n(z)}\cdot\tfrac13 z^3\,dz \Big/ \int_0^\infty e^{-G_n(z)}\,dz = \int_0^\infty g_{n+1}(z)\cdot z^2\,dz. \]

The second expression of $q_n$ shows that the constants $q_n$ perform a pincer
movement of the same type as the functions $g_n(z)$, and hence all $q_n$ lie between $q_1$
and $q_2$. Evaluating by partial integration the integral in the numerator of

\[ q_1 = \int_0^\infty e^{-z^3/3}\cdot\tfrac13 z^3\,dz \Big/ \int_0^\infty e^{-z^3/3}\,dz, \]

namely

\[ -\tfrac13\int_0^\infty z\cdot d\,e^{-z^3/3}, \]

we find $q_1 = \tfrac13$. By some rough estimates it is proved in §5 that $q_2 < 0.76$;
but the value of $q_2$ is probably not much larger than $q_1 = 0.33$. Once we are
sure that $q_2 < 1$ we see from (32) and $q_n \le q_2$ that the sequence $g_n(z)$ converges
at least as strongly as a geometric series of quotient $q_2$.










Uniqueness is assured since every solution $g$ of the equation $g = \Psi\{g\}$ is
necessarily sandwiched in between the odd and even $g_n$.
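
As an illustration of the process for $\lambda = \tfrac12$, the sketch below iterates a discretized version of the operator $\Psi\{g\}(z) = \int_z^\infty e^{-G}\,d\zeta \big/ \int_0^\infty e^{-G}\,d\zeta$, with $G(z) = \int_0^z (z-\zeta)^2 g(\zeta)\,d\zeta$, starting from $g_1 = 1$. It is a numerical experiment under stated assumptions, not the author's procedure: integrals are trapezoidal sums on a uniform grid, the half line is cut off at a hypothetical `z_max`, and all function names are invented for this example.

```python
import math


def cumulative_trapezoid(values, h):
    """Running trapezoidal integral of samples `values` with uniform spacing h."""
    total, out = 0.0, [0.0]
    for left, right in zip(values, values[1:]):
        total += 0.5 * h * (left + right)
        out.append(total)
    return out


def exp_minus_G(g, z, h):
    """Samples of exp(-G(z)), G(z) = int_0^z (z - s)^2 g(s) ds."""
    m0 = cumulative_trapezoid(g, h)
    m1 = cumulative_trapezoid([s * v for s, v in zip(z, g)], h)
    m2 = cumulative_trapezoid([s * s * v for s, v in zip(z, g)], h)
    return [math.exp(-(x * x * a - 2.0 * x * b + c))
            for x, a, b, c in zip(z, m0, m1, m2)]


def apply_psi(g, z, h):
    """Discretized Psi{g}; also returns H = int_0^z_max exp(-G) dz."""
    running = cumulative_trapezoid(exp_minus_G(g, z, h), h)
    total = running[-1]
    return [(total - part) / total for part in running], total


def homann_fixed_point(z_max=8.0, n=4000, steps=80):
    h = z_max / n
    z = [i * h for i in range(n + 1)]
    g = [1.0] * (n + 1)                          # g_1 = 1
    q1 = change = total = None
    for step in range(steps):
        g_next, total = apply_psi(g, z, h)
        if step == 0:
            # q_1 = int_0^inf g_2(z) z^2 dz, expected to equal 1/3.
            q1 = cumulative_trapezoid([x * x * v for x, v in zip(z, g_next)], h)[-1]
        change = max(abs(u - v) for u, v in zip(g_next, g))
        g = g_next
    beta = total ** -0.5                         # beta^2 = 1 / int_0^inf exp(-G) dz
    return g, beta, q1, change


g, beta, q1, change = homann_fixed_point()
a = beta ** -1.5                                 # w''(0) = a k^(3/2) with a = beta^(-3/2)
print(f"q1 ~ {q1:.4f}, last change {change:.1e}, beta ~ {beta:.4f}, a ~ {a:.4f}")
```

The printed `change` shows how quickly successive iterates agree on the grid, which gives an empirical impression of the geometric convergence discussed above.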


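5. Proof that $q_2 < 1$


We use the constants

\[ A = \int_0^\infty z\,e^{-z^3/3}\,dz = 3^{-1/3}\cdot\Gamma(\tfrac23), \]

\[ B = \int_0^\infty e^{-z^3/3}\,dz = 3^{-2/3}\cdot\Gamma(\tfrac13) = 1.288. \]

Their product is

\[ AB = \tfrac13\,\Gamma(\tfrac13)\,\Gamma(\tfrac23) = \frac{\pi}{3\sin(\pi/3)} = \frac{2\pi}{3\sqrt3}. \]

$q_2$ is defined as the quotient

\[ \int_0^\infty e^{-G_2(z)}\cdot\tfrac13 z^3\,dz \Big/ \int_0^\infty e^{-G_2(z)}\,dz. \]

Since

\[ G_2(z) \le G_1(z) = \tfrac13 z^3 \]

the denominator is greater than $B$. Let us split the integral of the numerator
into the parts

\[ \int_0^2 + \int_2^\infty, \]

and employ in the first part the initial terms of the power series of $G_2(z)$, in the
second part an asymptotic appraisal. We readily find

(33)   \[ G_2(z) = \tfrac13 z^3 - \frac{1}{3B}\int_0^z e^{-\zeta^3/3}(z - \zeta)^3\,d\zeta \]

and, because $e^{-\zeta^3/3} \le 1$,

\[ G_2(z) \ge \tfrac13 z^3 - \frac{1}{12B}\,z^4 \]

and thus, as long as $z \le 2$,

\[ G_2(z) \ge c\cdot\tfrac{1}{12} z^4 \quad \text{with } c = 2 - 1/B. \]

Consequently

\[ \int_0^2 = \int_0^2 e^{-G_2(z)}\cdot\tfrac13 z^3\,dz \le \int_0^2 e^{-cz^4/12}\cdot\tfrac13 z^3\,dz = \frac1c\,(1 - e^{-4c/3}) = 0.6574. \]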


Consequently 





SC 










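To find an asymptotic estimate write (33) in the form

\[ B\cdot G_2(z) = \Big\{\tfrac13 Bz^3 - \tfrac13\int_0^\infty e^{-\zeta^3/3}(z - \zeta)^3\,d\zeta\Big\} - \tfrac13\int_z^\infty e^{-\zeta^3/3}(\zeta - z)^3\,d\zeta. \]

The first term

\[ = Az^2 - z + \tfrac13 B, \]

the integral of the second term is changed by the substitution $\zeta \to \zeta + z$ into

\[ \int_0^\infty e^{-(\zeta+z)^3/3}\,\zeta^3\,d\zeta \le e^{-z^3/3}\int_0^\infty e^{-z^2\zeta}\,\zeta^3\,d\zeta = 6z^{-8}e^{-z^3/3}. \]

Hence

(34)   \[ G_2(z) \ge \frac{A}{B}\,z^2 - \frac{1}{B}\,z + \Big(\frac13 - \frac{1}{128B}\,e^{-8/3}\Big) \quad \text{for } z \ge 2. \]

Set

\[ 2\sqrt{AB} = 1/b, \quad \text{i.e. } (2b)^4 = 27/(2\pi)^2; \]

\[ x = \sqrt{A/B}\cdot z - b, \qquad x_0 = 2\sqrt{A/B} - b, \qquad B' = \frac13 - b^2 - \frac{1}{128B}\,e^{-8/3}, \]

so that the right side of (34) equals $x^2 + B'$. Then

\[ \int_2^\infty < \int_2^\infty e^{-(x^2 + B')}\cdot\tfrac13 z^3\,dz = \tfrac13(2Bb)^4 e^{-B'}\cdot\int_{x_0}^\infty e^{-x^2}(x + b)^3\,dx. \]

Developing

\[ (x + b)^3 = x^3 + 3x^2 b + 3xb^2 + b^3 \]

one readily finds

\[ \int_{x_0}^\infty e^{-x^2}(x + b)^3\,dx = \tfrac12 e^{-x_0^2}(x_0^2 + 1 + 3bx_0 + 3b^2) + (\tfrac32 b + b^3)\int_{x_0}^\infty e^{-x^2}\,dx. \]

But

\[ \int_{x_0}^\infty e^{-x^2}\,dx < \frac{1}{2x_0}\,e^{-x_0^2}. \]

Hence

\[ \int_2^\infty < \tfrac16(2Bb)^4 e^{-B' - x_0^2}\Big\{x_0^2 + 1 + 3bx_0 + 3b^2 + \frac{1}{x_0}\big(\tfrac32 b + b^3\big)\Big\} = 0.3171. \]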
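
The numerical constants of this section can be recomputed in a few lines. The sketch below is an arithmetic check only; it assumes the bounds in the following form (a reading of the printed formulas): the part over $0 \le z \le 2$ is bounded by $(1 - e^{-4c/3})/c$ with $c = 2 - 1/B$, and the part over $z \ge 2$ by $\tfrac16(2Bb)^4 e^{-B'-x_0^2}\{x_0^2 + 1 + 3bx_0 + 3b^2 + (\tfrac32 b + b^3)/x_0\}$ with $b = 1/(2\sqrt{AB})$, $x_0 = 2\sqrt{A/B} - b$ and $B' = \tfrac13 - b^2 - e^{-8/3}/(128B)$. The variable names are chosen here.

```python
import math

A = 3 ** (-1 / 3) * math.gamma(2 / 3)     # int_0^inf z exp(-z^3/3) dz
B = 3 ** (-2 / 3) * math.gamma(1 / 3)     # int_0^inf exp(-z^3/3) dz
product_check = 2 * math.pi / (3 * math.sqrt(3))

# Part over 0 <= z <= 2.
c = 2 - 1 / B
first_part = (1 - math.exp(-4 * c / 3)) / c

# Part over z >= 2 (asymptotic appraisal).
b = 1 / (2 * math.sqrt(A * B))
x0 = 2 * math.sqrt(A / B) - b
B_prime = 1 / 3 - b * b - math.exp(-8 / 3) / (128 * B)
bracket = x0 * x0 + 1 + 3 * b * x0 + 3 * b * b + (1.5 * b + b ** 3) / x0
second_part = (2 * B * b) ** 4 / 6 * math.exp(-B_prime - x0 * x0) * bracket

# The denominator of q_2 exceeds B, so q_2 is below (first + second) / B.
q2_bound = (first_part + second_part) / B

print(f"A*B = {A * B:.6f} vs 2*pi/(3*sqrt(3)) = {product_check:.6f}")
print(f"first part  ~ {first_part:.4f}")
print(f"second part ~ {second_part:.4f}")
print(f"q2 bound    ~ {q2_bound:.4f}")
```

The recomputed values agree with the printed figures 0.6574 and 0.3171 to about three decimals, and the resulting quotient stays below 0.76.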










The numerator $\int_0^\infty$ of $q_2$ turns out to be

\[ < 0.6574 + 0.3171 = 0.9745 \]

and $q_2$ itself

\[ < 0.9745/B < 0.76. \]


6. Set-up for arbitrary $\lambda$

Enriched by the experience gathered in the cases $\lambda = 0$ and $\tfrac12$ we now make
bold to attack $(A_\lambda)$ for arbitrary $\lambda \ge 0$. We seek the solution in the form
$w(z) = \kappa\cdot f(\kappa z)$ where

$$f'''' + 2ff''' + 2(1 - 2\lambda)f'f'' = 0 \quad (\text{for } z > 0); \tag{35}$$

$$f(0) = f'(0) = 0, \quad f''(0) = 1; \quad f''(\infty) = 0,$$

and introduce $f'' = g = \varphi$ as the unknown function. Hence we start with this
set-up:

$$f(z) = \int_0^z (z - t)\,g(t)\,dt. \tag{36}$$

$$\varphi'' + 2f\varphi' + 2(1 - 2\lambda)f'\varphi = 0, \tag{37}$$

$$\varphi(0) = 1, \quad \varphi(\infty) = 0. \tag{37*}$$

$$\varphi = g. \tag{38}$$

More explicitly: for an arbitrarily given function g we form (36) and then solve
the linear boundary value problem (37) + (37*), thus defining the functional
operator $\Phi_\lambda$ carrying g into $\varphi$; at the last step (38) we ask for a fixed element g
of that operator,

$$g = \Phi_\lambda\{g\}. \tag{39}$$

In proving the unique existence of $\varphi$ we shall fix the precise meaning of the
boundary condition $\varphi(\infty) = 0$. (The whole discussion would turn out a bit
simpler if we dealt with a finite interval $0 \le z \le a$ instead.)

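AUXILIARY THEOREM. If g(z) is any continuous non-negative function and

$$f'(z) = \int_0^z g(t)\,dt, \qquad f(z) = \int_0^z f'(t)\,dt = \int_0^z (z - t)\,g(t)\,dt$$

then (37) has a unique solution with the properties

$$\varphi(0) = 1; \quad \varphi(z) \ge 0, \quad \varphi' + 2f\varphi \le 0 \quad (\text{for } z \ge 0). \tag{40}$$

PROOF. Set

$$p(z) = \exp\Big(2\int_0^z f(t)\,dt\Big)$$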










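so that $p' = 2fp$ and $p(z) \ge 1$, and introduce the auxiliary function

$$\varphi_1 = (p\varphi)' = p(\varphi' + 2f\varphi).$$

Then

$$(\varphi_1/p)' = \varphi'' + 2f\varphi' + 2f'\varphi = 4\lambda f'\varphi,$$

and the single equation (37) for $\varphi$ is replaced by the system

$$\varphi' + 2f\varphi = \frac1p\,\varphi_1, \qquad \varphi_1' - 2f\varphi_1 = 4\lambda p f'\cdot\varphi. \tag{41}$$

It defines an infinitesimal linear transformation in two variables $\varphi$, $\varphi_1$ with the
matrix (of vanishing trace)

$$\Theta = \begin{Vmatrix} -2f, & 1/p \\ 4\lambda p f', & 2f \end{Vmatrix}.$$

Hence the two solutions $(\varphi, \varphi_1) = (\eta, \eta_1)$ and $(\vartheta, \vartheta_1)$ satisfying the initial
conditions

$$\eta = 1, \ \eta_1 = 0; \qquad \vartheta = 0, \ \vartheta_1 = 1 \quad (\text{for } z = 0)$$

are given by the formula

$$\begin{Vmatrix} \eta, & \vartheta \\ \eta_1, & \vartheta_1 \end{Vmatrix} = \sum_{n=0}^\infty \int\!\cdots\!\int_{z \ge z_1 \ge \cdots \ge z_n \ge 0} \Theta(z_1)\cdots\Theta(z_n)\,dz_1\cdots dz_n \tag{42}$$

(where the term n = 0 of the series at the right is understood to be the unit
matrix). Multiplication of (41) by $\varphi_1$, $\varphi$ respectively, followed by addition and
integration, establishes the fundamental relation

$$[\varphi\varphi_1]_0^z = \int_0^z \Big(\frac1p\,\varphi_1^2 + 4\lambda p f'\varphi^2\Big)\,dz. \tag{43}$$

The fact that $f' \ge 0$ guarantees the positive definite character of the "Dirichlet
integral" at the right side.

Apply (43) to $\vartheta$:

$$\vartheta\vartheta_1 = \int_0^z \Big(\frac1p\,\vartheta_1^2 + 4\lambda p f'\vartheta^2\Big)\,dz > 0 \quad\text{for } z > 0.$$

This shows (1) that $\vartheta_1$ never vanishes, therefore never changes sign and, because
of $\vartheta_1(0) = 1$, stays positive throughout; and (2) that $\vartheta$ has the same sign as $\vartheta_1$.
In the same manner we find that $\eta\eta_1$, $\eta$, $\eta_1$ (in this order) are all positive,

$$\vartheta > 0, \ \vartheta_1 > 0; \qquad \eta > 0, \ \eta_1 > 0 \quad\text{for } z > 0.$$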







Next consider a finite interval $0 \le z \le a$ $(a > 0)$ and determine that solution
$\varphi$ for which $\varphi(a) = 0$, $\varphi_1(a) = -1$. Again we see from

$$\varphi\varphi_1 = -\int_z^a \Big(\frac1p\,\varphi_1^2 + 4\lambda p f'\varphi^2\Big)\,dz < 0$$

that $\varphi_1$ is negative and $\varphi$ positive throughout the interval $0 \le z < a$. In
particular, $\varphi(0) > 0$, so that we can divide by $\varphi(0)$ thus constructing a solution
$\varphi^{(a)}$ with the boundary values

$$\varphi^{(a)}(0) = 1, \qquad \varphi^{(a)}(a) = 0.$$

It satisfies the inequalities

$$\varphi^{(a)} > 0, \quad \varphi_1^{(a)} < 0 \quad\text{for } 0 \le z < a.$$

Clearly $\varphi^{(a)}(z)$ is of the form $\eta(z) - l_a\cdot\vartheta(z)$ with a constant $l_a$ for which we find
the positive value $\eta(a)/\vartheta(a)$. Let $a < b$ and write

$$\varphi^{(b)}(z) = \varphi^{(a)}(z) + (l_a - l_b)\,\vartheta(z).$$

Then

$$l_a - l_b = \varphi^{(b)}(a)/\vartheta(a) > 0,$$

or the positive coefficient $l_a$ decreases with increasing a and thus tends to a
limit $l \ge 0$ for $a \to \infty$. The solution

$$\omega(z) = \eta(z) - l\cdot\vartheta(z)$$

is the one we wish to construct.⁹ It has the properties

$$\omega(0) = 1; \quad \omega(z) > 0, \quad \omega_1(z) \le 0, \tag{44}$$

and is characterized by the fact that the condition

$$\varphi(z) = \omega(z) - m\cdot\vartheta(z) \ge 0$$

cannot be satisfied throughout the interval $0 \le z < \infty$ for any positive constant m.
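The monotone behaviour of the coefficients $l_a = \eta(a)/\vartheta(a)$ can be observed numerically. The sketch below is an illustration only and not the author's method of proof: it assumes the particular choice $g \equiv 1$ (hence $f = z^2/2$, $f' = z$), the sample value $\lambda = 1$, and a classical Runge-Kutta integration of equation (37); the function names are hypothetical.

```python
def rhs(z, phi, dphi, lam):
    """phi'' = -2 f phi' - 2 (1 - 2 lam) f' phi, with g = 1 so f = z^2/2, f' = z."""
    f, fprime = 0.5 * z * z, z
    return dphi, -2.0 * f * dphi - 2.0 * (1.0 - 2.0 * lam) * fprime * phi

def integrate(phi0, dphi0, a, lam, steps=4000):
    """Classical fourth-order Runge-Kutta from z = 0 to z = a; returns phi(a)."""
    h = a / steps
    z, phi, dphi = 0.0, phi0, dphi0
    for _ in range(steps):
        k1 = rhs(z, phi, dphi, lam)
        k2 = rhs(z + h / 2, phi + h / 2 * k1[0], dphi + h / 2 * k1[1], lam)
        k3 = rhs(z + h / 2, phi + h / 2 * k2[0], dphi + h / 2 * k2[1], lam)
        k4 = rhs(z + h, phi + h * k3[0], dphi + h * k3[1], lam)
        phi += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dphi += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        z += h
    return phi

def l_a(a, lam):
    """Coefficient l_a = eta(a)/theta(a) of the solution vanishing at z = a."""
    # At z = 0 we have p = 1 and f = 0, so phi_1(0) equals phi'(0).
    eta_a = integrate(1.0, 0.0, a, lam)     # eta(0) = 1, eta_1(0) = 0
    theta_a = integrate(0.0, 1.0, a, lam)   # theta(0) = 0, theta_1(0) = 1
    return eta_a / theta_a

lam = 1.0  # sample value, chosen for illustration
values = [l_a(a, lam) for a in (1.0, 2.0, 3.0)]
print(values)
```

The printed coefficients are positive and decrease as a grows, which is the behaviour that the limit $l = \lim l_a$ rests on.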

It remains to show that no solution $\varphi$ except this $\omega$ satisfies (40). Indeed,
according to what has just been stated, any such solution would have to be of
the form

$$\varphi(z) = \omega(z) + m\cdot\vartheta(z), \qquad m > 0.$$

The required inequality $\varphi_1 \le 0$ or $(p\varphi)' \le 0$ implies $p\varphi \le 1$ for $z \ge 0$. This
remarkable relation prevails in particular for $\varphi = \omega$. On the other hand,

$$(\vartheta_1/p)' = 4\lambda f'\vartheta \ge 0,$$





⁹ A similar construction in H. Weyl, Nachr. Ges. Wissensch. Göttingen, 1909, p. 39.







therefore

$$\vartheta_1/p \ge 1, \qquad \vartheta_1 \ge p;$$

$$(p\vartheta)' = \vartheta_1 \ge p, \qquad p\vartheta \ge \int_0^z p(t)\,dt \ge z.$$

The consequent relation $p\varphi \ge m\cdot z$ is incompatible with $p\varphi \le 1$ for positive m.

We have now completely and unambiguously defined the functional operator
$\Phi_\lambda$ carrying a given $g(z) \ge 0$ into the function $\omega(z)$, $\omega = \Phi_\lambda\{g\}$. Since

$$p\omega \le 1, \quad\text{a fortiori}\quad \omega \le 1, \tag{45}$$

our operator $\Phi_\lambda$ obeys the law

$$0 \le \Phi_\lambda\{g\} \le 1.$$

Hence we can and will restrict ourselves to the set $\mathfrak{G}$ of all continuous functions
g(z) for which $0 \le g \le 1$. Were $\Phi_\lambda$ monotone in the sense that $\Phi_\lambda\{g\}$ decreases
while g increases, then there would be some hope for successful construction of a
fixed point of the operator $\Phi_\lambda$ in the functional space $\mathfrak{G}$ by some such alternating
process of successive approximation as carried us through in the special
instances $\lambda = 0$ and $\tfrac12$. Unfortunately this does not seem to be so, and this
calamity forces me to proceed by the general theory concerning fixed points of
functional operators which we owe to Birkhoff-Kellogg and Schauder-Leray.¹⁰
The main point will be to establish "equi-continuity" for the images $\omega = \Phi_\lambda\{g\}$
of all elements $g \in \mathfrak{G}$ and continuity for the operator $\Phi_\lambda$. The lemmas in the
following section are so conceived as to meet this demand.


7. Solving the problem $(A_\lambda)$

In the lemmas 1-4 the function g is supposed to be any element of $\mathfrak{G}$ and
$c, c_0, c_1, c_2, c_3$ are numbers not depending on g. The condition $g \le 1$ implies
$f' \le z$.

LEMMA 1.

$$0 < -\omega'(0) \le c_1.$$

PROOF. Denote $-\omega'(0)$ by $l$ and argue as follows:

$$(\omega_1/p)' = 4\lambda f'\omega \le 4\lambda z,$$

$$\omega_1/p \le -l + 2\lambda z^2,$$

and then, because $p \ge 1$ and $\omega_1$ negative,

$$(p\omega)' = \omega_1 \le \omega_1/p \le -l + 2\lambda z^2,$$

$$p\omega \le 1 - lz + \tfrac23\lambda z^3.$$


¹⁰ G. D. Birkhoff and O. D. Kellogg, Trans. Am. Math. Soc. 23, 1922, pp. 96-115.
J. Schauder, Studia Math. 2, 1930, pp. 171-180. J. Leray and J. Schauder, Ann. Sc. Ec.
Norm. Sup. 51, 1934, pp. 45-78.













Since $\omega(z) > 0$ we get

$$lz < 1 + \tfrac23\lambda z^3 \quad\text{or}\quad l < \frac1z + \tfrac23\lambda z^2$$

and taking z = 1 we have proved the lemma with $c_1 = 1 + \tfrac23\lambda$. However, we
exploit our inequality to the full when we choose $c_1$ as the minimum of the
elementary function at the right side of the last inequality, which is given by

$$c_1 = \sqrt[3]{9\lambda/2}. \tag{46}$$
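The value (46) is the minimum of $1/z + \tfrac23\lambda z^2$ over $z > 0$, attained where $z^3 = 3/(4\lambda)$. The following sketch compares the closed form with a crude grid search; the sample value of $\lambda$ and the function names are assumptions made for this illustration.

```python
def bound(z, lam):
    """Right side of the inequality l < 1/z + (2/3) * lam * z^2."""
    return 1 / z + (2 / 3) * lam * z * z

def c1_closed_form(lam):
    """Minimum of bound(z, lam) over z > 0, attained at z^3 = 3 / (4 * lam)."""
    return (9 * lam / 2) ** (1 / 3)

lam = 0.7  # sample value, chosen only for illustration
z_star = (3 / (4 * lam)) ** (1 / 3)

# Grid search over 0 < z < 10 with step 0.001 as an independent check.
grid_min = min(bound(0.001 * k, lam) for k in range(1, 10_000))

print(bound(z_star, lam), c1_closed_form(lam), grid_min)
```

The minimum never exceeds the value $1 + \tfrac23\lambda$ obtained from the cruder choice z = 1.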
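One could also argue from the equation

$$(p\omega')' = 2(2\lambda - 1)f'p\omega.$$

If $\lambda \le \tfrac12$ then

$$(p\omega')' \le 0, \qquad p\omega' \le -l,$$

and since $p \le e^{z^3/3}$,

$$\omega' \le -l\cdot e^{-z^3/3}, \qquad 0 \le \omega \le 1 - l\cdot\int_0^z e^{-z^3/3}\,dz,$$

therefore

$$l \le 1/B = 0.777 \quad\text{with}\quad B = \int_0^\infty e^{-z^3/3}\,dz.$$

If, however, $\lambda \ge \tfrac12$, then $f' \le z$, $p\omega \le 1$ yield

$$(p\omega')' \le 2(2\lambda - 1)z, \qquad p\omega' \le -l + (2\lambda - 1)z^2$$

and taking the value

$$\int_0^\infty z^2 e^{-z^3/3}\,dz = \int_0^\infty e^{-z^3/3}\cdot d(\tfrac13 z^3) = 1$$

into account, we obtain by the same argument

$$l \le 2\lambda/B.$$

Hence the lemma is satisfied with

$$c_1 = 1/B \ \text{ for } \lambda \le \tfrac12, \qquad c_1 = 2\lambda/B \ \text{ for } \lambda \ge \tfrac12.$$

For small $\lambda$ and large $\lambda$ our first appraisal (46) is better, but in a certain middle
range, namely for $0.104 \le \lambda \le 1.096$, the second gives a sharper result.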
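The endpoints of the middle range are the values of $\lambda$ at which the two appraisals coincide: $\sqrt[3]{9\lambda/2} = 1/B$ on the left and $\sqrt[3]{9\lambda/2} = 2\lambda/B$ on the right. A short check, with hypothetical function names `first` and `second` for the two bounds:

```python
import math

B = 3 ** (-2 / 3) * math.gamma(1 / 3)

def first(lam):
    """Appraisal (46): cube root of 9*lam/2."""
    return (9 * lam / 2) ** (1 / 3)

def second(lam):
    """Second appraisal: 1/B for lam <= 1/2, 2*lam/B for lam >= 1/2."""
    return 1 / B if lam <= 0.5 else 2 * lam / B

# Crossing points, solved in closed form from the two equalities above.
lam_low = 2 / (9 * B ** 3)       # (9 lam / 2)^(1/3) = 1/B
lam_high = 0.75 * B ** 1.5       # (9 lam / 2)^(1/3) = 2 lam / B

print(lam_low, lam_high)
```

Rounded to three decimals the crossings reproduce the limits 0.104 and 1.096 quoted in the text; between them `second` is the smaller bound, outside them `first` is.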
LEMMA 2.

$$0 \le -p\omega' \le c_1 + c_2 z^2, \tag{47}$$

and thus, a fortiori,

$$0 \le -\omega' \le c_1 + c_2 z^2. \tag{48}$$






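PROOF.

$$p\omega' = \omega_1 - 2pf\omega \le \omega_1 \le 0. \tag{49}$$

$$(-p\omega')' = 2(1 - 2\lambda)f'p\omega.$$

Again distinguish the cases $\lambda \ge \tfrac12$, $\le \tfrac12$. In the first case $-p\omega'$ decreases and
therefore

$$-p\omega' \le l \le c_1.$$

In the second case

$$(-p\omega')' \le 2(1 - 2\lambda)z,$$

$$-p\omega' \le l + (1 - 2\lambda)z^2 \le c_1 + (1 - 2\lambda)z^2.$$

Thus we have established Lemma 2 with

$$c_2 = 0 \ \text{ for } \lambda \ge \tfrac12, \qquad c_2 = 1 - 2\lambda \ \text{ for } \lambda \le \tfrac12.$$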


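The following two lemmas prepare the way for an asymptotic appraisal of
g(z) when z approaches infinity.

LEMMA 3.

$$\int_0^1 \omega(z)\,dz \ge c \ (> 0). \tag{50}$$

PROOF. Twice integrating (48) we find the inequalities

$$\omega(z) \ge 1 - c_1 z - \tfrac13 c_2 z^3,$$

$$\int_0^z \omega(t)\,dt \ge z - \tfrac12 c_1 z^2 - \tfrac1{12} c_2 z^4 \ge z\,(1 - \tfrac12 z/c_3)$$

for $z \le 1$ where

$$1/c_3 = \max\,(1, \ c_1 + \tfrac16 c_2).$$

Thus

$$\int_0^1 \omega\,dz \ge \int_0^{c_3} \omega\,dz \ge \tfrac12 c_3.$$

This argument is fully exploited by choosing $z = c_0$ as the point where the
polynomial $1 - c_1 z - \tfrac13 c_2 z^3$ changes sign and then computing the area

$$c = \int_0^{c_0} (1 - c_1 z - \tfrac13 c_2 z^3)\,dz = c_0\,(1 - \tfrac12 c_1 c_0 - \tfrac1{12} c_2 c_0^3) = \tfrac14 c_0\,(3 - c_0 c_1) \quad (\ge \tfrac12 c_0),$$

with the result

$$\int_0^{c_0} \omega(z)\,dz \ge c.$$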
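The closed form $c = \tfrac14 c_0(3 - c_0 c_1)$ follows from the antiderivative once $\tfrac13 c_2 c_0^3 = 1 - c_1 c_0$ is used at the root $c_0$. A numerical sketch, in which the sample constants and the bisection helper are assumptions made for illustration:

```python
def root_bisect(func, lo, hi, iters=200):
    """Bisection for a sign change of func on [lo, hi] (hypothetical helper)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

c1, c2 = 1.2, 0.5  # sample constants, for illustration only

def poly(z):
    """Lower bound 1 - c1*z - (1/3)*c2*z^3 for omega(z)."""
    return 1 - c1 * z - c2 * z ** 3 / 3

# poly(0) = 1 > 0 and poly(1/c1) < 0, so the sign change lies in (0, 1/c1).
c0 = root_bisect(poly, 0.0, 1 / c1)

area_antiderivative = c0 * (1 - c1 * c0 / 2 - c2 * c0 ** 3 / 12)
area_closed_form = 0.25 * c0 * (3 - c0 * c1)

print(c0, area_antiderivative, area_closed_form)
```

Since $c_0 c_1 < 1$ at the root, the closed form is at least $\tfrac12 c_0$, as stated.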










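LEMMA 4. If $\int_0^1 g(z)\,dz \ge \gamma$ then

$$0 \le \omega(z) \le e^{-\gamma(z-1)^2},$$

$$0 \le -(\omega' + 2f\omega) \le (c_1 + c_2 z^2)\,e^{-\gamma(z-1)^2}$$

for $z \ge 1$.

PROOF. The hypothesis implies for $z \ge 1$:

$$f'(z) \ge f'(1) \ge \gamma, \qquad f(z) \ge \gamma(z - 1), \qquad 2\int_0^z f(t)\,dt \ge \gamma(z - 1)^2,$$

$$p(z) \ge e^{\gamma(z-1)^2}.$$

Combine this with (45), (47) and (49): $-\omega_1 \le -p\omega'$.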

We topologize the functional space consisting of all functions g = g(z)
defined and continuous for $z \ge 0$ by agreeing that a sequence $g_n$ approaches
zero, $g_n \to 0$ with $n \to \infty$, if $g_n(z)$ converges to zero uniformly in each finite
interval; in other words, if to every $\epsilon > 0$, $z_0 > 0$ one can assign an $N(\epsilon, z_0)$ such
that $0 \le z \le z_0$, $n \ge N(\epsilon, z_0)$ imply $|g_n(z)| \le \epsilon$. This functional space $\mathfrak{F}$ is
complete, i.e. convergence of a sequence $g_n$, $g_n - g_m \to 0$ with $n, m \to \infty$, implies
convergence to some element g, $g_n - g \to 0$. Our domain $\mathfrak{G}$ defined by
$0 \le g(z) \le 1$ is a closed convex part of $\mathfrak{F}$. (See Appendix, under 1.)

Let $\delta(\epsilon, z_0)$ be any positive function of the variables $\epsilon > 0$, $z_0 > 0$. The
element $g \in \mathfrak{G}$ is said to lie in $\mathfrak{G}_\delta$ if the inequality $|g(z_1) - g(z)| \le \epsilon$ holds
whenever

$$0 \le z, z_1 \le z_0 \quad\text{and}\quad |z_1 - z| \le \delta(\epsilon, z_0)$$

(equi-continuity of type $\delta$). Clearly $\mathfrak{G}_\delta$ is a compact subset of $\mathfrak{G}$; one has only
to consider the values of g for rational arguments, marching them off in Indian
file. The same simple argument of interpolation as employed by Birkhoff and
Kellogg, l.c.,¹⁰ in proving their Theorem II (p. 103) yields the following general
principle (see Appendix, under 2):

An operator $\Phi\{g\}$ defined and continuous in $\mathfrak{G}$ and mapping $\mathfrak{G}$ into $\mathfrak{G}_\delta$ necessarily
has a fixed element $g = \Phi\{g\}$.

According to (48), Lemma 2, our operator $\Phi_\lambda$ maps $\mathfrak{G}$ into the subset $\mathfrak{G}_\delta$
corresponding to the function

$$\delta(\epsilon, z_0) = \epsilon/(c_1 + c_2 z_0^2).$$

Hence the existence of a solution g of the functional equation $g = \Phi_\lambda\{g\}$ will be
proved as soon as we can establish continuity of the operator $\Phi_\lambda$ in $\mathfrak{G}$:

LEMMA 5. The image $\omega = \Phi_\lambda\{g\}$ depends continuously on $g \in \mathfrak{G}$.

This fact, which we are now going to prove, is less trivial than it appears,
because it implies that $\omega$ varies but little when the "tail" of the function $g \in \mathfrak{G}$
(i.e. its values for large values of the argument z) changes arbitrarily. The
explicit formula (42) clearly shows that the particular solutions $\eta$, $\eta_1$ and $\vartheta$, $\vartheta_1$
depend continuously on g; these functions in a finite interval $0 \le z \le z_0$ do not






depend on what g(z) does for $z > z_0$. The salient point is the way in which the
constant $l = -\omega'(0)$ in

$$\omega(z) = \eta(z) - l\cdot\vartheta(z)$$

depends on g.

Let $g^{(n)} \in \mathfrak{G}$ be a sequence which in the sense of our topology tends to g. Using
the notations $\eta^{(n)}$, $\vartheta^{(n)}$, $l^{(n)}$, $\omega^{(n)}$ in an obvious way we have

$$\eta^{(n)} \to \eta, \ \eta_1^{(n)} \to \eta_1; \qquad \vartheta^{(n)} \to \vartheta, \ \vartheta_1^{(n)} \to \vartheta_1, \quad\text{with } n \to \infty.$$

According to Lemma 1, all $l^{(n)}$ lie between 0 and $c_1$ and thus this sequence has
at least one point of condensation $l^*$. The functions

$$\omega^*(z) = \eta(z) - l^*\cdot\vartheta(z), \qquad \omega_1^*(z) = \eta_1(z) - l^*\cdot\vartheta_1(z),$$

being the limits of a subsequence of $\omega^{(n)}$, $\omega_1^{(n)}$, satisfy the inequalities $\omega^* \ge 0$,
$\omega_1^* \le 0$. However, we know that $\omega$ is the only solution of this kind, and
consequently $l^* = l$. Thus the bounded sequence $l^{(n)}$ has only one condensation
point, namely $l$, and therefore converges to $l$.
Our proof for the existence of a function $g \in \mathfrak{G}$ which is its own image under
the operator $\Phi_\lambda$ is now complete. As an image it satisfies the inequalities
(49), (50),

$$g' \le 0, \qquad \int_0^1 g(z)\,dz \ge c, \tag{51}$$

besides $0 \le g \le 1$, and as image of a g for which (51) holds, it satisfies the
further conditions

$$g(z) \le e^{-c(z-1)^2}, \qquad 0 \le -(g' + 2fg) \le (c_1 + c_2 z^2)\,e^{-c(z-1)^2} \tag{52}$$

for $z \ge 1$ (Lemma 4). Expressing everything in terms of f we see that
$f(0) = f'(0) = 0$, $f''(0) = 1$ and

$$f''' + 2ff'' - 2\lambda f'^2 = \text{const.} \tag{53}$$

Moreover $f''$ is monotone decreasing and $f''$ as well as $f''' + 2ff''$ tend to zero
with $z \to \infty$ essentially as strongly as $e^{-cz^2}$. Hence the positive integral

$$\int_0^\infty g(z)\,dz = f'(\infty) = \beta$$

converges and for the constant on the right side of (53) we find the value $-2\lambda\beta^2$
so that

$$f''' + 2ff'' + 2\lambda(\beta^2 - f'^2) = 0.$$

An explicit appraisal of $\beta$ is obtained from (51) and (52):

$$c \le \beta \le 1 + \sqrt{\pi/2c}.$$
















Putting finally

$$w(z) = \kappa\cdot f(\kappa z) \quad\text{with}\quad \kappa = (k/\beta)^{1/2}$$

we may formulate our chief result as follows:

THEOREM. For given positive k the problem $(A_\lambda)$ has a solution w whose derivative
is monotone increasing from 0 to k as z travels from 0 to infinity; the second
derivative decreases monotonely from

$$w''(0) = \alpha k^{3/2} \qquad (\alpha = \beta^{-3/2})$$

to zero, approaching zero with $z \to \infty$ at least as strongly as a function of the type
$e^{-\gamma z^2}$ $(\gamma > 0)$.

So far so good. But I should like to see the aircraft engineer who will apply 
this method to compute the boundary layer for a given profile of an aerofoil! 

We have not proved that there is only one solution of the problem $(A_\lambda)$.
One could try to approach the question of uniqueness by studying the continuous
variation of the operator $\Phi_\lambda$ with $\lambda$.¹¹ It seems not impossible to attack the
general boundary layer equation (B,) by the method here developed. 


8. Appendix 


1. The bounded continuous functions g form a normed linear space $\mathfrak{F}_0$ in
which the norm $\|g\|$ induces our topology if, somewhat artificially, we define
the norm by

$$\|g\| = \sum_\nu 2^{-\nu}\,\Big\{\max_{0 \le z \le \nu} |g(z)|\Big\} \qquad (\nu = 1, 2, \cdots).$$

Whereas $\mathfrak{F}_0$ is incomplete, the part $\mathfrak{G}$ is a complete closed subset of $\mathfrak{F}_0$. Hence
the "general principle" on which we base our argument will fit into Schauder's
scheme only if one slightly generalizes his central theorem (Satz 2, on p. 175 of
Studia Mathematica 2, 1930) in the following manner. Let $\mathfrak{F}_0$ be a normed
linear space; a continuous mapping of a complete and closed convex subset $\mathfrak{G}$
of $\mathfrak{F}_0$ into a compact subset $\mathfrak{G}^*$ of $\mathfrak{G}$ has a fixed point.

2. In adapting the proof to our conditions, I give it a more constructive twist.
For the open interval $0 \le z < \infty$ one has to combine ever more refined
subdivision with exhaustion. Let therefore n, $\nu$ be two positive integers. For
any given numbers

$$x_m \qquad (m = 0, 1, \cdots, n\nu; \ 0 \le x_m \le 1)$$

form by linear interpolation the function $g(z) = g(z; x_0, \cdots, x_{n\nu})$ with the
prescribed values

$$g(m/n) = x_m \qquad (m = 0, 1, \cdots, n\nu)$$

in the interval $0 \le z \le \nu$ and extrapolate it beyond $\nu$ by $g(z) = x_{n\nu}$ for $z \ge \nu$.
Denote by $x_m^*$ the values of its image $g^* = \Phi\{g\}$ at the points z = m/n (m =





11 See E. Rothe, Bull. Am. Math. Soc. 45, 1939, pp. 606-613. 







$0, 1, \cdots, n\nu$). The continuous mapping $x_m \to x_m^*$ of the $(n\nu + 1)$-dimensional
unit cube $0 \le x_0 \le 1, \cdots, 0 \le x_{n\nu} \le 1$ has a fixed point (Brouwer's Theorem);
choose one, let it have the coordinates $x_m^0$, and set

$$g_{n,\nu}(z) = g(z; x_0^0, \cdots, x_{n\nu}^0).$$

The image $g_{n,\nu}^*$ of $g_{n,\nu}$ takes on the same values $x_m = x_m^0$ as $g_{n,\nu}$ itself at the points
z = m/n. Let $\nu_0$ be a positive integer. Since $g_{n,\nu}^* \in \mathfrak{G}_\delta$,

$$|g_{n,\nu}^*(z) - x_m| \le \epsilon \quad\text{for}\quad \frac mn \le z \le \frac{m+1}{n} \qquad (m = 0, 1, \cdots, n\nu_0 - 1) \tag{54}$$

provided $\nu \ge \nu_0$ and $1/n \le \delta(\epsilon, \nu_0)$; in particular $\big(z = \frac{m+1}{n}\big)$

$$|x_{m+1} - x_m| \le \epsilon.$$

Hence, because $g_{n,\nu}(z)$ is the linear interpolation of the values $x_m$, the inequality

$$|g_{n,\nu}(z) - x_m| \le \epsilon$$

holds under the same conditions as (54) and thus

$$|g_{n,\nu}^*(z) - g_{n,\nu}(z)| \le 2\epsilon \quad\text{for } 0 \le z \le \nu_0$$

as soon as $\nu \ge \nu_0$ and $n \ge 1/\delta(\epsilon, \nu_0)$. This means that $g_{n,\nu}^* - g_{n,\nu} \to 0$ with
n and $\nu$ tending to infinity. A subsequence¹² of the $g_{n,\nu}^* \in \mathfrak{G}_\delta$ tends to a limit g,
the corresponding subsequence of the $g_{n,\nu}$ to the same limit, and, on account of
the continuity of $\Phi$, the relation $g_{n,\nu}^* = \Phi\{g_{n,\nu}\}$ yields $g = \Phi\{g\}$.
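The interpolating function $g(z; x_0, \cdots, x_{n\nu})$ used in this construction is easy to write down explicitly. The sketch below covers only that building block (piecewise linear on $0 \le z \le \nu$, constant beyond $\nu$); the name `interpolant` and the sample data are assumptions made for illustration, and the operator $\Phi$ and the Brouwer fixed point are not computed here.

```python
def interpolant(x, n, nu):
    """Return g(z; x_0, ..., x_{n*nu}): linear between the nodes m/n, constant for z >= nu."""
    if len(x) != n * nu + 1:
        raise ValueError("expected n*nu + 1 prescribed values")

    def g(z):
        if z >= nu:
            return x[-1]                     # extrapolation g(z) = x_{n*nu}
        m = min(int(z * n), n * nu - 1)      # index of the subinterval [m/n, (m+1)/n]
        t = z * n - m                        # relative position inside the subinterval
        return (1 - t) * x[m] + t * x[m + 1]

    return g

# Sample data (illustrative): n = 4 nodes per unit length on the interval [0, 2].
n, nu = 4, 2
x = [1.0, 0.9, 0.7, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
g = interpolant(x, n, nu)

print([g(m / n) for m in range(n * nu + 1)], g(10.0))
```

Because every prescribed value lies between 0 and 1, so does the interpolant, which is why such functions stay inside the domain on which the operator is defined.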
INSTITUTE FOR ADVANCED STUDY


¹² By a subsequence of the pairs $(n, \nu)$ we mean a sequence $(n_i, \nu_i)$ both members of
which are monotone: $n_i < n_{i+1}$, $\nu_i < \nu_{i+1}$.





